Subblock coding by generalized intra prediction in video coding

ABSTRACT

A video coder may be configured to code video data by performing splitting of a coding unit (CU) of video data using intra sub-partition (ISP) to form a set of prediction blocks. The video coder may group a plurality of the prediction blocks from the set of prediction blocks into a first prediction block group (PBG). The video coder may reconstruct samples of prediction blocks included in the first PBG independently of samples of other prediction blocks included in the first PBG.

This application claims the benefit of U.S. Provisional Patent Application No. 62/812,078, filed Feb. 28, 2019, U.S. Provisional Patent Application No. 62/817,474, filed Mar. 12, 2019, and U.S. Provisional Patent Application No. 62/862,936, filed Jun. 18, 2019, the entire content of each of which is incorporated by reference.

TECHNICAL FIELD

This disclosure relates to video encoding and video decoding.

BACKGROUND

Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, so-called “smart phones,” video teleconferencing devices, video streaming devices, and the like. Digital video devices implement video coding techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4, Part 10, Advanced Video Coding (AVC), ITU-T H.265/High Efficiency Video Coding (HEVC), and extensions of such standards. The video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing such video coding techniques.

Video coding (e.g., video encoding and/or video decoding) typically involves predicting a block of video data from either an already coded block of video data in the same picture (e.g., intra prediction) or an already coded block of video data in a different picture (e.g., inter prediction). Video coding techniques include spatial (intra-picture) prediction and/or temporal (inter-picture) prediction to reduce or remove redundancy inherent in video sequences. For block-based video coding, a video slice (e.g., a video picture or a portion of a video picture) may be partitioned into video blocks, which may also be referred to as coding tree units (CTUs), coding units (CUs) and/or coding nodes. Video blocks in an intra-coded (I) slice of a picture are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same picture. Video blocks in an inter-coded (P or B) slice of a picture may use spatial prediction with respect to reference samples in neighboring blocks in the same picture or temporal prediction with respect to reference samples in other reference pictures. Pictures may be referred to as frames, and reference pictures may be referred to as reference frames.

SUMMARY

In general, this disclosure describes techniques related to the coding of multiple prediction blocks into prediction block groups using intra sub-partition (ISP) coding. In particular, a video coder (e.g., video encoder and/or video decoder) may group multiple prediction blocks, such as multiple subblocks of a coding unit, into prediction block groups. For example, a single prediction block group may include more than one prediction block or subblock of the coding unit. In one example, a first prediction block group may include a first subblock of a coding unit and a second subblock of the coding unit.

In some examples, a video coder may reconstruct samples of prediction blocks included in the first prediction block group independently of the reconstruction of samples of other prediction blocks included in that same prediction block group. For example, the video coder may reconstruct samples of one prediction block before, without, or in parallel with, the reconstruction of samples of another prediction block included in the same prediction block group. In some instances, the prediction block grouping techniques of this disclosure may be implemented when instances of intra sub-partition coding are used to include splitting of coding units, such as vertical splitting, horizontal splitting, a combination of horizontal splitting and vertical splitting, etc. Furthermore, the prediction block grouping techniques of this disclosure may be implemented when the prediction block has a relatively narrow dimension (e.g., narrow height, narrow width, etc.), such that the dimensions of the prediction block satisfy a predefined threshold for prediction block grouping. One or more techniques described herein for grouping prediction blocks into prediction block groups may be applied to any of the existing video codecs, such as HEVC (High Efficiency Video Coding) or H.266/Versatile Video Coding (VVC) standard, or may be used in any future video coding standards.

According to one example, a method of coding video data is disclosed. The method comprises performing splitting of a coding unit (CU) of video data using intra sub-partition (ISP) to form a set of prediction blocks, the prediction blocks including at least a first prediction block and a second prediction block; grouping a plurality of prediction blocks from the set of prediction blocks into a first prediction block group (PBG); and reconstructing samples of prediction blocks included in the first PBG independently of samples of other prediction blocks included in the first PBG.

According to another example, a device for coding video data is disclosed. The device includes a memory configured to store video data and one or more processors implemented in circuitry and configured to perform splitting of a coding unit (CU) of video data using intra sub-partition (ISP) to form a set of prediction blocks, the prediction blocks including at least a first prediction block and a second prediction block; group a plurality of prediction blocks from the set of prediction blocks into a first prediction block group (PBG); and reconstruct samples of prediction blocks included in the first PBG independently of samples of other prediction blocks included in the first PBG.

According to another example, a non-transitory computer-readable storage medium is disclosed. The non-transitory computer-readable storage medium has stored thereon instructions that, when executed, cause one or more processors to: perform splitting of a coding unit (CU) of video data using intra sub-partition (ISP) to form a set of prediction blocks, the prediction blocks including at least a first prediction block and a second prediction block; group a plurality of prediction blocks from the set of prediction blocks into a first prediction block group (PBG); and reconstruct samples of prediction blocks included in the first PBG independently of samples of other prediction blocks included in the first PBG.

According to another example, an apparatus for coding video data is disclosed. The apparatus includes means for performing splitting of a coding unit (CU) of video data using intra sub-partition (ISP) to form a set of prediction blocks, the prediction blocks including at least a first prediction block and a second prediction block; grouping a plurality of prediction blocks from the set of prediction blocks into a first prediction block group (PBG); and reconstructing samples of prediction blocks included in the first PBG independently of samples of other prediction blocks included in the first PBG.

The details of one or more examples are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the techniques described in this disclosure will be apparent from the description, drawings, and claims.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a block diagram illustrating an example video encoding and decoding system that may perform the techniques of this disclosure.

FIG. 2 is a diagram illustrating example intra prediction modes.

FIG. 3 is a block diagram illustrating an example of an 8×4 rectangular block.

FIGS. 4A-4C are block diagrams providing illustrations of mode mapping process(es) for modes outside the diagonal direction range.

FIG. 5 is a block diagram illustrating wide-angles that are adopted in VTM2.

FIG. 6 is a block diagram illustrating wider-angles that are adopted in VTM3.

FIG. 7 is a table that specifies the mapping table between predModeIntra and the angle parameter intraPredAngle in VTM3.

FIG. 8 is a block diagram illustrating example divisions of blocks for intra sub-partition coding.

FIG. 9 is a block diagram illustrating example divisions of blocks for intra sub-partition coding.

FIG. 10 is a block diagram illustrating multiple reference line prediction.

FIG. 11 is a conceptual diagram illustrating an example of how a coding block may be split into four horizontal planar prediction blocks.

FIG. 12 is a conceptual diagram illustrating an example of how a coding block may be split into four vertical planar prediction blocks.

FIG. 13 is a conceptual diagram illustrating an example of such a planar prediction block structure.

FIG. 14 is a block diagram illustrating how the coding block is split into four prediction blocks.

FIG. 15 is a block diagram illustrating an example where a 4×N coding block is coded with intra sub-partitions and vertical partitioning.

FIG. 16 is a conceptual diagram illustrating an example where an M×N coding block is split into four prediction blocks.

FIG. 17 is a conceptual diagram illustrating an example in which only a vertical intra mode prediction is allowed for 4×N coding blocks that are coded with intra sub-partition coding (ISP) and vertical split.

FIG. 18 is a conceptual diagram illustrating an example in which an 8×N (N>4) coding block is coded with ISP and split vertically.

FIGS. 19A and 19B are conceptual diagrams illustrating an example quadtree binary tree (QTBT) structure, and a corresponding coding tree unit (CTU).

FIG. 20 is a block diagram illustrating an example video encoder that may perform the techniques of this disclosure.

FIG. 21 is a block diagram illustrating an example video decoder that may perform the techniques of this disclosure.

FIG. 22 is a flowchart illustrating an example method for encoding a current block.

FIG. 23 is a flowchart illustrating an example method for decoding a current block.

FIG. 24 is a flowchart illustrating an example method for performing prediction block grouping on a current block of video data.

DETAILED DESCRIPTION

In general, this disclosure describes techniques related to improvements to the coding and/or decoding of prediction blocks using prediction block groups in intra sub-partition (ISP) coding. Such techniques may be applied to current or future video coding standards, including the Versatile Video Coding (VVC) standard presently under development.

In some examples, a video coder may code blocks of video data using ISP coding. For example, the video coder may partition blocks of video data into subblocks (e.g., prediction blocks). In some examples, the subblocks may have a relatively narrow size in one or more dimensions. In an illustrative example, a coding unit (CU), such as a coding block, may have the dimensions of Width (W)×Height (H). In one example, the W of the CU may be equal to 4 and H may be equal to N, where N is greater than 0. In the illustrative example, a video coder may code the CU of 4×N using ISP coding. For example, the video coder may use vertical splitting or horizontal splitting to partition the CU. In another example, the video coder may use a combination of vertical splitting and horizontal splitting. In such instances, the video coder may partition a first one or more parts of the CU using vertical splitting and a second one or more parts of the same CU using horizontal splitting.

In an example involving vertical splitting, the resultant subblocks may have a size of 1×N. In addition, N may be greater than 8 (e.g., H is greater than 8). In such examples, the video coder may utilize row-wise storage of samples. When a video coder uses row-wise storage of samples, narrow subblocks, such as those having a dimension below a predefined threshold, may result in the video coder performing inefficient and time-consuming memory access procedures. For example, in such instances, the video coder, when performing a memory access procedure, may then access (e.g., read) a higher number of samples from memory compared to the number of samples in the subblock. When a block is of size M×N, accessing the M samples in each row from memory may often involve an overhead, such as accessing more than M samples due to limitations on the minimum number of samples per access. For example, a given memory access procedure may require a video coder to read samples of a first row before reading samples from another row. In such examples, the video coder may then read a minimum of X samples before accessing a target sample in one narrow subblock, where X is greater than the number of samples in that narrow subblock. Due to the narrow shape of 1×N and 2×N subblocks, this problem is exacerbated as the number of rows is larger for narrow subblocks. For example, a 4×4 CU may only have four rows. In such examples, a video coder may need to access 16 rows for a 1×16 subblock, even though the subblock has the same number of samples as the 4×4 CU. The resultant overhead may cause a video coder to experience delays during the coding process, particularly in the worst case where there are several small coding blocks in high resolution sequences. In addition, the reconstruction of a subblock in the CU is dependent on the reconstruction of other subblocks in the coding block that precede it in decoding order. In such instances, a video coder is likely to experience an increased delay in coding or decoding the CU due to subblocks of this nature.
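As a non-limiting illustration of the memory access overhead described above, the following sketch counts the samples a video coder might read from row-wise storage for blocks of different shapes. The function names and the minimum access width of 8 samples are assumptions chosen for illustration only, and accesses are assumed to be aligned to the start of each row of the block.

```python
import math


def samples_fetched(width, height, min_access=8):
    """Samples read from row-wise memory for a width x height block.

    min_access is a hypothetical minimum number of samples per memory
    access (an assumption, not a value from this disclosure). Each row of
    the block needs ceil(width / min_access) accesses.
    """
    accesses_per_row = math.ceil(width / min_access)
    return height * accesses_per_row * min_access


def overhead_ratio(width, height, min_access=8):
    """Ratio of samples fetched to samples actually contained in the block."""
    return samples_fetched(width, height, min_access) / (width * height)


if __name__ == "__main__":
    # A 4x4 block and a 1x16 subblock hold the same 16 samples, but the
    # narrow subblock spans four times as many rows.
    for w, h in [(4, 4), (1, 16), (2, 16), (4, 16)]:
        print(f"{w}x{h}: fetched={samples_fetched(w, h)}, "
              f"ratio={overhead_ratio(w, h):.1f}")
```

Under these assumptions, the 1×16 subblock requires four times as many fetched samples as the 4×4 block, even though both contain 16 samples.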

The aforementioned issues, among others, may be addressed by the disclosed prediction block grouping techniques by grouping prediction blocks (e.g., subblocks) into prediction block groups (PBGs) and reconstructing samples of prediction blocks included in one PBG independently of samples of other prediction blocks included in the same PBG. Specifically, a coding unit may be split vertically, horizontally, or a combination of horizontally and vertically, using intra sub-partition (ISP) to form the prediction blocks that may then be grouped into various PBGs.
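As a non-limiting illustration, the grouping of vertically split subblocks into PBGs may be sketched as follows. The function names, the tuple layout, and the rule of accumulating adjacent subblocks until a minimum group width is reached are assumptions made for this sketch and are not asserted to be the specific grouping rule of this disclosure.

```python
def isp_vertical_split(cu_width, cu_height, num_parts):
    """Split a CU vertically into equal-width prediction blocks.

    Returns a list of (x_offset, width, height) tuples (hypothetical layout).
    """
    sub_w = cu_width // num_parts
    return [(i * sub_w, sub_w, cu_height) for i in range(num_parts)]


def group_into_pbgs(blocks, min_group_width=4):
    """Group adjacent prediction blocks into prediction block groups.

    Blocks are accumulated left to right until the combined width reaches
    min_group_width, a hypothetical threshold chosen for illustration.
    """
    groups, current, width = [], [], 0
    for blk in blocks:
        current.append(blk)
        width += blk[1]
        if width >= min_group_width:
            groups.append(current)
            current, width = [], 0
    if current:  # leftover blocks form a final, narrower group
        groups.append(current)
    return groups


if __name__ == "__main__":
    blocks = isp_vertical_split(4, 16, 4)  # four 1x16 subblocks
    for i, pbg in enumerate(group_into_pbgs(blocks, min_group_width=2)):
        # Blocks within one PBG are candidates for independent
        # (e.g., parallel) reconstruction.
        print(f"PBG {i}: {pbg}")
```

In this sketch, a 4×16 CU split into four 1×16 subblocks yields two PBGs of two subblocks each when the assumed minimum group width is 2, and a single PBG when it is 4.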

Various techniques in this disclosure may be described with reference to a video coder or with reference to video coding, which are intended to be generic terms that can refer to either a video encoder and video encoding or a video decoder and video decoding. Unless explicitly stated otherwise, it should not be assumed that techniques described with respect to a video encoder or a video decoder cannot be performed by the other of a video encoder or a video decoder. For example, in many instances, a video decoder performs the same, or sometimes a reciprocal, coding technique as a video encoder in order to decode encoded video data. In many instances, a video encoder also includes a video decoding loop, and thus the video encoder performs video decoding as part of encoding video data. Thus, unless stated otherwise, the techniques described in this disclosure with respect to a video decoder may also be performed by a video encoder, and vice versa.

FIG. 1 is a block diagram illustrating an example video encoding and decoding system 100 that may perform the techniques of this disclosure. The techniques of this disclosure are generally directed to coding (encoding and/or decoding) video data. In general, video data includes any data for processing a video. Thus, video data may include raw, uncoded video, encoded video, decoded (e.g., reconstructed) video, and video metadata, such as signaling data.

As shown in FIG. 1, system 100 includes a source device 102 that provides encoded video data to be decoded and displayed by a destination device 116, in this example. In particular, source device 102 provides the video data to destination device 116 via a computer-readable medium 110. Source device 102 and destination device 116 may comprise any of a wide range of devices, including desktop computers, notebook (i.e., laptop) computers, tablet computers, set-top boxes, telephone handsets such as smartphones, televisions, cameras, display devices, digital media players, video gaming consoles, video streaming devices, or the like. In some cases, source device 102 and destination device 116 may be equipped for wireless communication, and thus may be referred to as wireless communication devices.

In the example of FIG. 1, source device 102 includes video source 104, memory 106, video encoder 200, and output interface 108. Destination device 116 includes input interface 122, video decoder 300, memory 120, and display device 118. In accordance with this disclosure, video encoder 200 of source device 102 and video decoder 300 of destination device 116 may be configured to apply the techniques for generalized intra prediction. Thus, source device 102 represents an example of a video encoding device, while destination device 116 represents an example of a video decoding device. In other examples, a source device and a destination device may include other components or arrangements. For example, source device 102 may receive video data from an external video source, such as an external camera. Likewise, destination device 116 may interface with an external display device, rather than including an integrated display device.

System 100 as shown in FIG. 1 is merely one example. In general, any digital video encoding and/or decoding device may perform techniques for generalized intra prediction. Source device 102 and destination device 116 are merely examples of such coding devices in which source device 102 generates coded video data for transmission to destination device 116. This disclosure refers to a “coding” device as a device that performs coding (encoding and/or decoding) of data. Thus, video encoder 200 and video decoder 300 represent examples of coding devices, in particular, a video encoder and a video decoder, respectively. In some examples, devices 102, 116 may operate in a substantially symmetrical manner such that each of devices 102, 116 includes video encoding and decoding components. Hence, system 100 may support one-way or two-way video transmission between video devices 102, 116, e.g., for video streaming, video playback, video broadcasting, or video telephony.

In general, video source 104 represents a source of video data (i.e., raw, uncoded video data) and provides a sequential series of pictures (also referred to as “frames”) of the video data to video encoder 200, which encodes data for the pictures. Video source 104 of source device 102 may include a video capture device, such as a video camera, a video archive containing previously captured raw video, and/or a video feed interface to receive video from a video content provider. As a further alternative, video source 104 may generate computer graphics-based data as the source video, or a combination of live video, archived video, and computer-generated video. In each case, video encoder 200 encodes the captured, pre-captured, or computer-generated video data. Video encoder 200 may rearrange the pictures from the received order (sometimes referred to as “display order”) into a coding order for coding. Video encoder 200 may generate a bitstream including encoded video data. Source device 102 may then output the encoded video data via output interface 108 onto computer-readable medium 110 for reception and/or retrieval by, e.g., input interface 122 of destination device 116.

Memory 106 of source device 102 and memory 120 of destination device 116 represent general purpose memories. In some examples, memories 106, 120 may store raw video data, e.g., raw video from video source 104 and raw, decoded video data from video decoder 300. Additionally or alternatively, memories 106, 120 may store software instructions executable by, e.g., video encoder 200 and video decoder 300, respectively. Although memory 106 and memory 120 are shown separately from video encoder 200 and video decoder 300 in this example, it should be understood that video encoder 200 and video decoder 300 may also include internal memories for functionally similar or equivalent purposes. Furthermore, memories 106, 120 may store encoded video data, e.g., output from video encoder 200 and input to video decoder 300. In some examples, portions of memories 106, 120 may be allocated as one or more video buffers, e.g., to store raw, decoded, and/or encoded video data.

Computer-readable medium 110 may represent any type of medium or device capable of transporting the encoded video data from source device 102 to destination device 116. In one example, computer-readable medium 110 represents a communication medium to enable source device 102 to transmit encoded video data directly to destination device 116 in real-time, e.g., via a radio frequency network or computer-based network. Output interface 108 may modulate a transmission signal including the encoded video data, and input interface 122 may demodulate the received transmission signal, according to a communication standard, such as a wireless communication protocol. The communication medium may comprise any wireless or wired communication medium, such as a radio frequency (RF) spectrum or one or more physical transmission lines. The communication medium may form part of a packet-based network, such as a local area network, a wide-area network, or a global network such as the Internet. The communication medium may include routers, switches, base stations, or any other equipment that may be useful to facilitate communication from source device 102 to destination device 116.

In some examples, computer-readable medium 110 may include storage device 112. Source device 102 may output encoded data from output interface 108 to computer-readable medium 110 (e.g., storage device 112). Similarly, destination device 116 may access encoded data from computer-readable medium 110 (e.g., storage device 112) via input interface 122. Storage device 112 may include any of a variety of distributed or locally accessed data storage media such as a hard drive, Blu-ray discs, DVDs, CD-ROMs, flash memory, volatile or non-volatile memory, or any other suitable digital storage media for storing encoded video data.

In some examples, computer-readable medium 110 may include file server 114 or another intermediate storage device that may store the encoded video data generated by source device 102. Source device 102 may output encoded video data to file server 114 or another intermediate storage device that may store the encoded video generated by source device 102. Destination device 116 may access stored video data from file server 114 via streaming or download. File server 114 may be any type of server device capable of storing encoded video data and transmitting that encoded video data to the destination device 116. File server 114 may represent a web server (e.g., for a website), a File Transfer Protocol (FTP) server, a content delivery network device, or a network attached storage (NAS) device. Destination device 116 may access encoded video data from file server 114 through any standard data connection, including an Internet connection. This may include a wireless channel (e.g., a Wi-Fi connection), a wired connection (e.g., digital subscriber line (DSL), cable modem, etc.), or a combination of both that is suitable for accessing encoded video data stored on file server 114. File server 114 and input interface 122 may be configured to operate according to a streaming transmission protocol, a download transmission protocol, or a combination thereof.

Output interface 108 and input interface 122 may represent wireless transmitters/receivers, modems, wired networking components (e.g., Ethernet cards), wireless communication components that operate according to any of a variety of IEEE 802.11 standards, or other physical components. In examples where output interface 108 and input interface 122 comprise wireless components, output interface 108 and input interface 122 may be configured to transfer data, such as encoded video data, according to a cellular communication standard, such as 4G, 4G-LTE (Long-Term Evolution), LTE Advanced, 5G, or the like. In some examples where output interface 108 comprises a wireless transmitter, output interface 108 and input interface 122 may be configured to transfer data, such as encoded video data, according to other wireless standards, such as an IEEE 802.11 specification, an IEEE 802.15 specification (e.g., ZigBee™), a Bluetooth™ standard, or the like. In some examples, source device 102 and/or destination device 116 may include respective system-on-a-chip (SoC) devices. For example, source device 102 may include an SoC device to perform the functionality attributed to video encoder 200 and/or output interface 108, and destination device 116 may include an SoC device to perform the functionality attributed to video decoder 300 and/or input interface 122.

The techniques of this disclosure may be applied to video coding in support of any of a variety of multimedia applications, such as over-the-air television broadcasts, cable television transmissions, satellite television transmissions, Internet streaming video transmissions, such as dynamic adaptive streaming over HTTP (DASH), digital video that is encoded onto a data storage medium, decoding of digital video stored on a data storage medium, or other applications.

Input interface 122 of destination device 116 receives an encoded video bitstream from computer-readable medium 110 (e.g., a communication medium, storage device 112, file server 114, or the like). The encoded video bitstream may include signaling information defined by video encoder 200, which is also used by video decoder 300, such as syntax elements having values that describe characteristics and/or processing of video blocks or other coded units (e.g., slices, pictures, groups of pictures, sequences, or the like). Display device 118 displays decoded pictures of the decoded video data to a user. Display device 118 may represent any of a variety of display devices such as a cathode ray tube (CRT), a liquid crystal display (LCD), a plasma display, an organic light emitting diode (OLED) display, or another type of display device.

Although not shown in FIG. 1, in some examples, video encoder 200 and video decoder 300 may each be integrated with an audio encoder and/or audio decoder, and may include appropriate MUX-DEMUX units, or other hardware and/or software, to handle multiplexed streams including both audio and video in a common data stream. If applicable, MUX-DEMUX units may conform to the ITU H.223 multiplexer protocol, or other protocols such as the user datagram protocol (UDP).

Video encoder 200 and video decoder 300 each may be implemented as any of a variety of suitable encoder and/or decoder circuitry, such as one or more microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), discrete logic, software, hardware, firmware or any combinations thereof. When the techniques are implemented partially in software, a device may store instructions for the software in a suitable, non-transitory computer-readable medium and execute the instructions in hardware using one or more processors to perform the techniques of this disclosure. Each of video encoder 200 and video decoder 300 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (CODEC) in a respective device. A device including video encoder 200 and/or video decoder 300 may comprise an integrated circuit, a microprocessor, and/or a wireless communication device, such as a cellular telephone.

Video encoder 200 and video decoder 300 may operate according to a video coding standard, such as ITU-T H.265, also referred to as High Efficiency Video Coding (HEVC) or extensions thereto, such as the multi-view and/or scalable video coding extensions. Alternatively, video encoder 200 and video decoder 300 may operate according to other proprietary or industry standards, such as the Joint Exploration Test Model (JEM) or ITU-T H.266, also referred to as Versatile Video Coding (VVC). A recent draft of the VVC standard is described in Bross, et al. “Versatile Video Coding (Draft 4),” Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 13th Meeting: Marrakech, MA, 9-18 Jan. 2019, JVET-M1001-v2 (hereinafter “VVC Draft 4”). The techniques of this disclosure, however, are not limited to any particular coding standard.

In general, video encoder 200 and video decoder 300 may perform block-based coding of pictures. The term “block” generally refers to a structure including data to be processed (e.g., encoded, decoded, or otherwise used in the encoding and/or decoding process). For example, a block may include a two-dimensional matrix of samples of luminance and/or chrominance data. In general, video encoder 200 and video decoder 300 may code video data represented in a YUV (e.g., Y, Cb, Cr) format. That is, rather than coding red, green, and blue (RGB) data for samples of a picture, video encoder 200 and video decoder 300 may code luminance and chrominance components, where the chrominance components may include both red hue and blue hue chrominance components. In some examples, video encoder 200 converts received RGB formatted data to a YUV representation prior to encoding, and video decoder 300 converts the YUV representation to the RGB format. Alternatively, pre- and post-processing units (not shown) may perform these conversions.
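As a non-limiting illustration of the RGB-to-YUV conversion mentioned above, the following sketch converts one 8-bit RGB sample to Y, Cb, and Cr values. This disclosure does not specify a conversion matrix; the full-range ITU-R BT.601 coefficients used here (as in JFIF) and the function name are assumptions for illustration only.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB sample to 8-bit (Y, Cb, Cr).

    Uses full-range BT.601 coefficients, an assumption for this sketch;
    other matrices (e.g., BT.709) or limited-range variants may be used.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round to the nearest integer and clamp to the 8-bit sample range.
    return tuple(max(0, min(255, round(v))) for v in (y, cb, cr))


if __name__ == "__main__":
    for name, rgb in [("white", (255, 255, 255)),
                      ("black", (0, 0, 0)),
                      ("red", (255, 0, 0))]:
        print(name, rgb_to_ycbcr(*rgb))
```

Gray samples map to neutral chrominance (Cb and Cr of 128), which illustrates how luminance and chrominance information are separated in such a representation.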

This disclosure may generally refer to coding (e.g., encoding and decoding) of pictures to include the process of encoding or decoding data of the picture. Similarly, this disclosure may refer to coding of blocks of a picture to include the process of encoding or decoding data for the blocks, e.g., prediction and/or residual coding. An encoded video bitstream generally includes a series of values for syntax elements representative of coding decisions (e.g., coding modes) and partitioning of pictures into blocks. Thus, references to coding a picture or a block should generally be understood as coding values for syntax elements forming the picture or block.

HEVC defines various blocks, including coding units (CUs), prediction units (PUs), and transform units (TUs). According to HEVC, a video coder (such as video encoder 200) partitions a coding tree unit (CTU) into CUs according to a quadtree structure. That is, the video coder partitions CTUs and CUs into four equal, non-overlapping squares, and each node of the quadtree has either zero or four child nodes. Nodes without child nodes may be referred to as “leaf nodes,” and CUs of such leaf nodes may include one or more PUs and/or one or more TUs. The video coder may further partition PUs and TUs. For example, in HEVC, a residual quadtree (RQT) represents partitioning of TUs. In HEVC, PUs represent inter-prediction data, while TUs represent residual data. CUs that are intra-predicted include intra-prediction information, such as an intra-mode indication. This disclosure may generally use the term partition to represent a block as a TU, PU, or CU, or to represent a plurality of blocks or region that includes multiple blocks.
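As a non-limiting illustration of the quadtree structure described above, the following sketch recursively partitions a square CTU into leaf nodes, where each node has either zero or four child nodes. The function name, the `should_split` callback, and the minimum size of 8 are assumptions for illustration; an actual encoder would typically base the split decision on, e.g., rate-distortion cost.

```python
def quadtree_leaves(x, y, size, should_split, min_size=8):
    """Return the leaf blocks (x, y, size) of a quadtree rooted at a CTU.

    should_split is a hypothetical callback standing in for the coder's
    split decision; min_size is an assumed minimum block size.
    """
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        # Visit the four equal, non-overlapping squares in raster order.
        for dy in (0, half):
            for dx in (0, half):
                leaves.extend(
                    quadtree_leaves(x + dx, y + dy, half, should_split, min_size))
        return leaves
    return [(x, y, size)]  # leaf node: no child nodes


if __name__ == "__main__":
    # Toy decision: split the 64x64 CTU, then split only its top-left 32x32.
    def decide(x, y, size):
        return size == 64 or (x == 0 and y == 0 and size > 16)

    for leaf in quadtree_leaves(0, 0, 64, decide):
        print(leaf)
```

With this toy decision, the 64×64 CTU yields four 16×16 leaves in its top-left quadrant and three 32×32 leaves elsewhere, and the leaves together cover the CTU exactly once.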

As another example, video encoder 200 and video decoder 300 may be configured to operate according to JEM or VVC. According to JEM or VVC, a video coder (such as video encoder 200) partitions a picture into a plurality of coding tree units (CTUs). Video encoder 200 may partition a CTU according to a tree structure, such as a quadtree-binary tree (QTBT) structure or Multi-Type Tree (MTT) structure. The QTBT structure removes the concepts of multiple partition types, such as the separation between CUs, PUs, and TUs of HEVC. A QTBT structure includes two levels: a first level partitioned according to quadtree partitioning, and a second level partitioned according to binary tree partitioning. A root node of the QTBT structure corresponds to a CTU. Leaf nodes of the binary trees correspond to coding units (CUs).

In an MTT partitioning structure, blocks may be partitioned using aquadtree (QT) partition, a binary tree (BT) partition, and one or moretypes of triple tree (TT) partitions. A triple tree partition is apartition where a block is split into three sub-blocks. In someexamples, a triple tree partition divides a block into three sub-blockswithout dividing the original block through the center. The partitioningtypes in MTT (e.g., QT, BT, and TT), may be symmetrical or asymmetrical.

In some examples, video encoder 200 and video decoder 300 may use asingle QTBT or MTT structure to represent each of the luminance andchrominance components, while in other examples, video encoder 200 andvideo decoder 300 may use two or more QTBT or MTT structures, such asone QTBT/MTT structure for the luminance component and another QTBT/MTTstructure for both chrominance components (or two QTBT/MTT structuresfor respective chrominance components).

Video encoder 200 and video decoder 300 may be configured to usequadtree partitioning per HEVC, QTBT partitioning, MTT partitioning, orother partitioning structures. For purposes of explanation, thedescription of the techniques of this disclosure is presented withrespect to QTBT partitioning. However, it should be understood that thetechniques of this disclosure may also be applied to video codersconfigured to use quadtree partitioning, or other types of partitioningas well.

This disclosure may use “N×N” and “N by N” interchangeably to refer tothe sample dimensions of a block (such as a CU or other video block) interms of vertical and horizontal dimensions, e.g., 16×16 samples or 16by 16 samples. In general, a 16×16 CU will have 16 samples in a verticaldirection (y=16) and 16 samples in a horizontal direction (x=16).Likewise, an N×N CU generally has N samples in a vertical direction andN samples in a horizontal direction, where N represents a nonnegativeinteger value. The samples in a CU may be arranged in rows and columns.Moreover, CUs need not necessarily have the same number of samples inthe horizontal direction as in the vertical direction. For example, CUsmay comprise N×M samples, where M is not necessarily equal to N.

Video encoder 200 encodes video data for CUs representing predictionand/or residual information, and other information. The predictioninformation indicates how the CU is to be predicted in order to form aprediction block for the CU. The residual information generallyrepresents sample-by-sample differences between samples of the CU priorto encoding and the prediction block.

To predict a CU, video encoder 200 may generally form a prediction block for the CU through inter-prediction or intra-prediction. Inter-prediction generally refers to predicting the CU from data of a previously coded picture, whereas intra-prediction generally refers to predicting the CU from previously coded data of the same picture. To perform inter-prediction, video encoder 200 may generate the prediction block using one or more motion vectors. Video encoder 200 may generally perform a motion search to identify a reference block that closely matches the CU, e.g., in terms of differences between the CU and the reference block. Video encoder 200 may calculate a difference metric using a sum of absolute difference (SAD), sum of squared differences (SSD), mean absolute difference (MAD), mean squared differences (MSD), or other such difference calculations to determine whether a reference block closely matches the current CU. In some examples, video encoder 200 may predict the current CU using uni-directional prediction or bi-directional prediction.
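As a non-limiting illustration of the difference metrics named above, the following Python sketch computes SAD, SSD, MAD, and MSD for two equally sized blocks. The function name difference_metrics and the list-of-rows block representation are assumptions made for this example and are not part of the disclosure.

```python
def difference_metrics(current, reference):
    """Return SAD, SSD, MAD and MSD between two equally sized sample blocks.

    Blocks are given as lists of rows (an assumption made for this sketch).
    """
    if len(current) != len(reference) or any(
        len(row_c) != len(row_r) for row_c, row_r in zip(current, reference)
    ):
        raise ValueError("blocks must have the same dimensions")

    # Sample-by-sample differences, flattened in row order.
    diffs = [
        c - r
        for row_c, row_r in zip(current, reference)
        for c, r in zip(row_c, row_r)
    ]
    count = len(diffs)
    sad = sum(abs(d) for d in diffs)   # sum of absolute differences
    ssd = sum(d * d for d in diffs)    # sum of squared differences
    return {"SAD": sad, "SSD": ssd, "MAD": sad / count, "MSD": ssd / count}


metrics = difference_metrics([[10, 12], [8, 9]], [[9, 12], [10, 6]])
```

In a motion search, a video encoder might evaluate such a metric for each candidate reference block and keep the candidate with the smallest value.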

Some examples of JEM and VVC also provide an affine motion compensation mode, which may be considered an inter-prediction mode. In affine motion compensation mode, video encoder 200 may determine two or more motion vectors that represent non-translational motion, such as zoom in or out, rotation, perspective motion, or other irregular motion types.

To perform intra-prediction, video encoder 200 may select an intra-prediction mode to generate the prediction block. Some examples of JEM and VVC provide sixty-seven intra-prediction modes, including various directional modes, as well as planar mode and DC mode. In general, video encoder 200 selects an intra-prediction mode that describes neighboring samples to a current block (e.g., a block of a CU) from which to predict samples of the current block. Such samples may generally be above, above and to the left, or to the left of the current block in the same picture as the current block, assuming video encoder 200 codes CTUs and CUs in raster scan order (left to right, top to bottom).

Video encoder 200 encodes data representing the prediction mode for a current block. For example, for inter-prediction modes, video encoder 200 may encode data representing which of the various available inter-prediction modes is used, as well as motion information for the corresponding mode. For uni-directional or bi-directional inter-prediction, for example, video encoder 200 may encode motion vectors using advanced motion vector prediction (AMVP) or merge mode. Video encoder 200 may use similar modes to encode motion vectors for affine motion compensation mode.

Following prediction, such as intra-prediction or inter-prediction of a block, video encoder 200 may calculate residual data for the block. The residual data, such as a residual block, represents sample by sample differences between the block and a prediction block for the block, formed using the corresponding prediction mode. Video encoder 200 may apply one or more transforms to the residual block, to produce transformed data in a transform domain instead of the sample domain. For example, video encoder 200 may apply a discrete cosine transform (DCT), an integer transform, a wavelet transform, or a conceptually similar transform to residual video data. Additionally, video encoder 200 may apply a secondary transform following the first transform, such as a mode-dependent non-separable secondary transform (MDNSST), a signal dependent transform, a Karhunen-Loeve transform (KLT), or the like. Video encoder 200 produces transform coefficients following application of the one or more transforms.

As noted above, following any transforms to produce transform coefficients, video encoder 200 may perform quantization of the transform coefficients. Quantization generally refers to a process in which transform coefficients are quantized to possibly reduce the amount of data used to represent the transform coefficients, providing further compression. By performing the quantization process, video encoder 200 may reduce the bit depth associated with some or all of the transform coefficients. For example, video encoder 200 may round an n-bit value down to an m-bit value during quantization, where n is greater than m. In some examples, to perform quantization, video encoder 200 may perform a bitwise right-shift of the value to be quantized.
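As a non-limiting illustration, the bitwise right-shift described above may be sketched in Python as follows. The function names quantize_by_shift and dequantize_by_shift, and the handling of negative coefficients by shifting the magnitude, are assumptions made for this example; the sketch omits any scaling factor or rounding offset that a practical quantizer may apply.

```python
def quantize_by_shift(coeff, shift):
    """Quantize a transform coefficient by right-shifting its magnitude.

    Shifting the magnitude (an assumption of this sketch) rounds positive
    and negative coefficients toward zero symmetrically.
    """
    sign = -1 if coeff < 0 else 1
    return sign * (abs(coeff) >> shift)


def dequantize_by_shift(level, shift):
    """Approximate inverse: scale the quantized level back up."""
    sign = -1 if level < 0 else 1
    return sign * (abs(level) << shift)


level = quantize_by_shift(1000, 4)          # 1000 >> 4
restored = dequantize_by_shift(level, 4)    # differs from 1000 by the discarded low bits
```

The difference between the original coefficient and the restored value is the quantization error introduced by discarding the low-order bits.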

Following quantization, video encoder 200 may scan the transform coefficients, producing a one-dimensional vector from the two-dimensional matrix including the quantized transform coefficients. The scan may be designed to place higher energy (and therefore lower frequency) transform coefficients at the front of the vector and to place lower energy (and therefore higher frequency) transform coefficients at the back of the vector. In some examples, video encoder 200 may utilize a predefined scan order to scan the quantized transform coefficients to produce a serialized vector, and then entropy encode the quantized transform coefficients of the vector. In other examples, video encoder 200 may perform an adaptive scan. After scanning the quantized transform coefficients to form the one-dimensional vector, video encoder 200 may entropy encode the one-dimensional vector, e.g., according to context-adaptive binary arithmetic coding (CABAC). Video encoder 200 may also entropy encode values for syntax elements describing metadata associated with the encoded video data for use by video decoder 300 in decoding the video data.
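As a non-limiting illustration of a predefined scan, the following Python sketch serializes a two-dimensional coefficient matrix along anti-diagonals, starting at the top-left (lowest frequency) position. The function name diagonal_scan and the particular anti-diagonal order are assumptions chosen for simplicity; the scan orders used by an actual video coder may differ.

```python
def diagonal_scan(block):
    """Serialize a 2-D coefficient matrix along anti-diagonals.

    Positions with a smaller (row + column) sum are visited first, so
    coefficients near the top-left corner land at the front of the vector.
    """
    height = len(block)
    width = len(block[0]) if height else 0
    vector = []
    for diagonal in range(height + width - 1):
        for y in range(height):
            x = diagonal - y
            if 0 <= x < width:
                vector.append(block[y][x])
    return vector


quantized = [
    [9, 3, 0],
    [2, 1, 0],
    [0, 0, 0],
]
serialized = diagonal_scan(quantized)
```

For the matrix shown, the nonzero coefficients cluster at the front of the serialized vector and the trailing entries are zero, which is the property the scan is intended to provide for entropy coding.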

To perform CABAC, video encoder 200 may assign a context within a context model to a symbol to be transmitted. The context may relate to, for example, whether neighboring values of the symbol are zero-valued or not. The probability determination may be based on a context assigned to the symbol.

Video encoder 200 may further generate syntax data, such as block-based syntax data, picture-based syntax data, and sequence-based syntax data, for video decoder 300, e.g., in a picture header, a block header, a slice header, or other syntax data, such as a sequence parameter set (SPS), picture parameter set (PPS), or video parameter set (VPS). Video decoder 300 may likewise decode such syntax data to determine how to decode corresponding video data.

In this manner, video encoder 200 may generate a bitstream including encoded video data, e.g., syntax elements describing partitioning of a picture into blocks (e.g., CUs) and prediction and/or residual information for the blocks. Ultimately, video decoder 300 may receive the bitstream and decode the encoded video data.

In general, video decoder 300 performs a reciprocal process to that performed by video encoder 200 to decode the encoded video data of the bitstream. For example, video decoder 300 may decode values for syntax elements of the bitstream using CABAC in a manner substantially similar to, albeit reciprocal to, the CABAC encoding process of video encoder 200. The syntax elements may define partitioning information for partitioning a picture into CTUs, and partitioning of each CTU according to a corresponding partition structure, such as a QTBT structure, to define CUs of the CTU. The syntax elements may further define prediction and residual information for blocks (e.g., CUs) of video data.

The residual information may be represented by, for example, quantized transform coefficients. Video decoder 300 may inverse quantize and inverse transform the quantized transform coefficients of a block to reproduce a residual block for the block. Video decoder 300 uses a signaled prediction mode (intra- or inter-prediction) and related prediction information (e.g., motion information for inter-prediction) to form a prediction block for the block. Video decoder 300 may then combine the prediction block and the residual block (on a sample-by-sample basis) to reproduce the original block. Video decoder 300 may perform additional processing, such as performing a deblocking process to reduce visual artifacts along boundaries of the block.
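As a non-limiting illustration of the sample-by-sample combination described above, the following Python sketch adds a residual block to a prediction block. The function name reconstruct_block, the default bit depth of 8, and the clipping of each sum to the valid sample range are assumptions made for this example.

```python
def reconstruct_block(prediction, residual, bit_depth=8):
    """Combine prediction and residual blocks sample by sample.

    Each sum is clipped to [0, 2**bit_depth - 1]; the clipping and the
    default bit depth are assumptions of this sketch.
    """
    max_value = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_value) for p, r in zip(row_p, row_r)]
        for row_p, row_r in zip(prediction, residual)
    ]


reconstructed = reconstruct_block([[250, 10], [128, 64]], [[10, -20], [3, -4]])
```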

In accordance with the techniques of this disclosure, video encoder 200 and/or video decoder 300 may perform generalized intra prediction. For instance, video encoder 200 and/or video decoder 300 may code an indication of the performance of generalized prediction for a current coding block of video data; split, based on the indication, the current coding block into one or more prediction blocks; define, based on the one or more prediction blocks, one or more prediction block groups (PBGs); determine a coding order of the PBGs; and code the PBGs in the coding order.

In some examples, a video coder (e.g., video encoder 200 and/or video decoder 300) may perform splitting of a coding unit (CU) of video data using intra sub-partition (ISP) to form a set of prediction blocks. For example, the video coder may perform vertical splitting or horizontal splitting of the CU. In another example, the video coder may perform splitting of the CU as a combination of vertical splitting and horizontal splitting of the CU. For example, the video coder may perform vertical splitting on parts of the CU and perform horizontal splitting on other parts of the CU. In any case, the video coder may split the CU, via various splitting or partitioning techniques, to form a set of prediction blocks.

In some examples, the prediction blocks may include at least a first prediction block and a second prediction block. In some examples, a CU may include a single prediction block. The prediction blocks may include narrow vertical blocks, narrow horizontal blocks, or a combination of vertical blocks and horizontal blocks. It should be noted that the term “narrow” in this context generally includes non-square blocks (e.g., rectangular blocks, etc.).

In addition, the video coder may group a plurality of the set of prediction blocks into a first prediction block group (PBG). In some examples, a PBG may include at least two prediction blocks. For example, a PBG may include one or more vertical prediction blocks, one or more horizontal prediction blocks, or one or more combinations of vertical prediction blocks and horizontal prediction blocks. In another example, a PBG may include a single prediction block. In addition, a CU may include PBGs of varying sizes. For example, a single CU may include a plurality of PBGs with a first PBG formed by grouping one or more prediction blocks (e.g., vertical, horizontal, or both vertical and horizontal blocks), and a second PBG formed by grouping at least two prediction blocks (e.g., vertical, horizontal, or both vertical and horizontal blocks). In some examples, the single CU may additionally include a third PBG formed by grouping at least two prediction blocks (e.g., vertical blocks, horizontal blocks, or both vertical and horizontal blocks), a fourth PBG formed by grouping at least three prediction blocks (e.g., vertical, horizontal, or both), etc.
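As a non-limiting illustration of the grouping described above, the following Python sketch assigns consecutive prediction blocks of a CU to PBGs of given sizes. The function name group_prediction_blocks, the use of integer indices to identify prediction blocks, and the restriction to consecutive blocks are assumptions made for this example; other groupings are possible.

```python
def group_prediction_blocks(num_blocks, group_sizes):
    """Group consecutive prediction block indices into PBGs.

    group_sizes lists how many prediction blocks each PBG contains;
    the sizes must add up to num_blocks.
    """
    if any(size < 1 for size in group_sizes):
        raise ValueError("each PBG must contain at least one prediction block")
    if sum(group_sizes) != num_blocks:
        raise ValueError("group sizes must cover every prediction block once")

    groups = []
    start = 0
    for size in group_sizes:
        groups.append(list(range(start, start + size)))
        start += size
    return groups


# Four prediction blocks grouped into two PBGs of two blocks each.
equal_groups = group_prediction_blocks(4, [2, 2])
# The same four blocks grouped into PBGs of varying sizes.
varying_groups = group_prediction_blocks(4, [1, 3])
```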

In addition, one or more subsequent CUs may include the same or a different arrangement of prediction blocks and/or PBGs. For example, a second CU may include the same or at least a similar partitioning to that of one or more preceding CUs. In another example, the second CU may include a different partitioning scheme to that of the one or more preceding CUs. For example, the video coder may partition a first CU using vertical splitting, the second CU using horizontal splitting, a third CU using a combination of horizontal and vertical splitting, a fourth CU using a different type of horizontal splitting (e.g., to form more or less narrow prediction blocks relative to other partitioned prediction blocks), or any other combinations or variations.

While this disclosure describes certain examples in terms of vertical splitting or, in some instances, horizontal splitting, it will be understood that the techniques of this disclosure are not so limited. In an additional non-limiting example and for illustration purposes, the video coder may partition a plurality of CUs using the same partitioning scheme, such as by using vertical splitting to partition at least two CUs. The video coder may additionally group the partitioned prediction blocks, for each partitioned CU, differently or the same. For example, the video coder may perform vertical splitting on a first CU to form a group of at least three PBGs, whereas the video coder may perform a similar vertical splitting on one or more subsequent CUs, but to form at least two PBGs. That is, the quantity of PBGs for each CU may not necessarily be equal from one partitioned CU to a next partitioned CU, even in cases where the video coder partitions each CU using, for example, vertical splitting. It will be understood that, for sake of brevity, not all options may be explicitly delineated, but that other splitting and/or grouping options may be achieved by one or more of the various techniques, or variations of the techniques, described in this disclosure. In addition, a video coder may perform an independent reconstruction of such PBGs, as described above and as further described below.

The video coder may then reconstruct samples of prediction blocks included in the first PBG independently of samples of other prediction blocks included in the first PBG. For example, the video coder may reconstruct the samples as part of a decoding loop. In other instances, video decoder 300 may split a CU to form a set of prediction blocks and group the prediction blocks into PBGs.

This disclosure may generally refer to “signaling” certain information, such as syntax elements. The term “signaling” may generally refer to the communication of values for syntax elements and/or other data used to decode encoded video data. That is, video encoder 200 may signal values for syntax elements in the bitstream. In general, signaling refers to generating a value in the bitstream. As noted above, source device 102 may transport the bitstream to destination device 116 substantially in real time, or not in real time, such as might occur when storing syntax elements to storage device 112 for later retrieval by destination device 116.

FIG. 2 is a diagram illustrating example intra prediction modes. An example of intra prediction is wide-angle intra prediction. In some examples, the intra prediction of a luma block includes 35 modes, including the Planar prediction mode, DC prediction mode and 33 angular (or directional) prediction modes. For example, directional prediction for square blocks, in some instances, may use directions between −135 degrees to 45 degrees of a current block 250, such as in the VVC test model 2 (VTM2) (see “Algorithm description for Versatile Video Coding and Test Model 2 (VTM2),” 11th JVET Meeting, Ljubljana, SI, July 2018, JVET-K1002 by J. Chen, Y. Ye, and S. Kim), as illustrated in FIG. 2. The 35 modes of the intra prediction are indexed as shown in Table 1 below.

TABLE 1 Specification of intra prediction mode and associated names

Intra prediction mode    Associated name
0                        INTRA_PLANAR
1                        INTRA_DC
2 . . . 34               INTRA_ANGULAR2 . . . INTRA_ANGULAR34

In VTM2, the block structure used for specifying the prediction block for intra prediction is not restricted to be square (width w=height h). For example, rectangular prediction blocks (w>h or w<h) can increase the coding efficiency based on the characteristics of the content.

In such rectangular blocks, restricting the direction of intra prediction to be within −135 degrees to 45 degrees can result in situations where farther reference samples are used rather than closer reference samples for intra prediction. Such a design is likely to have an impact on the coding efficiency. It may be more beneficial to have the range of restrictions relaxed so that closer reference samples (beyond the −135 to 45-degree angle) may be used for prediction. An example of such a case is given in FIG. 3.

FIG. 3 is a block diagram illustrating an example of an 8×4 rectangular block. In the example of FIG. 3, “closer” reference samples (indicated by circle 3002) are not used, but “farther” reference samples (indicated by circle 3006) may be used, due to restriction of intra prediction direction to be in the range of −135 degrees to 45 degrees. That is, in the example of FIG. 3, some reference samples within −135 degrees but farther than those indicated by circle 3002 may be used, while some reference samples are not used although they are closer than other samples, e.g., closer than the samples indicated by circle 3006.

FIGS. 4A-4C are block diagrams providing illustrations of mode mapping process(es) for modes outside the diagonal direction range. In the example of FIG. 4A, a square block CU 402 does not require angular mode remapping. FIG. 4B illustrates angular mode remapping for a horizontal non-square block CU 404. FIG. 4C illustrates angular mode remapping for a vertical non-square block CU 406.

FIG. 5 is a block diagram illustrating wide-angle intra-prediction adopted in VTM2. In some examples, wide-angle intra-prediction includes angular intra prediction for square and non-square blocks, such as block 550.

During the 12th JVET meeting, a modification of wide-angle intra prediction was adopted into VVC test model 3 (VTM3), details of which are available from (i) “CE3-related: Unification of angular intra prediction for square and non-square blocks,” 12th JVET Meeting, Macau SAR, CN, October 2018, JVET-L0279 by L. Zhao, X. Zhao, S. Liu, and X. Li; (ii) “Algorithm description for Versatile Video Coding and Test Model 3 (VTM3),” 12th JVET Meeting, Macau SAR, CN, October 2018, JVET-L1002 by J. Chen, Y. Ye, and S. Kim; and (iii) “Versatile Video Coding (Draft 3),” 12th JVET Meeting, Macau SAR, CN, October 2018, JVET-L1001 by B. Bross, J. Chen, and S. Liu.

This adoption includes two modifications to unify the angular intra prediction for square and non-square blocks. First, angular prediction directions may be modified to cover diagonal directions of all block shapes. Second, angular directions may be kept within the range between the bottom-left diagonal direction and the top-right diagonal direction for all block aspect ratios (square and non-square) as illustrated in FIGS. 4A-4C and described above. In addition, the number of reference samples in the top reference row and left reference column are restricted to 2*width+1 and 2*height+1 for all block shapes. In the example of FIG. 5, wide angles (−1 to −10, and 67 to 76) are depicted in addition to the 65 angular modes.

FIG. 6 is a block diagram illustrating wider angles for prediction of blocks 650 that are adopted in VTM3. Although VTM3 defines 95 modes for any block size, only 67 modes may be allowed. The exact modes that are allowed depend on the ratio of block width to block height. This is done by restricting the mode range for certain block sizes. FIG. 6 provides an illustration of wide angles (−1 to −14, and 67 to 80) in VTM3 beyond modes 2 and 66 for a total of 93 angular modes.

FIG. 7 is an example mapping table that provides an example mapping between predModeIntra and the angle parameter intraPredAngle in VTM3, further details of which are available from “Versatile Video Coding (Draft 3),” 12th JVET Meeting, Macau SAR, CN, October 2018, JVET-L1001 by B. Bross, J. Chen, and S. Liu. In the following, angular modes with a positive intraPredAngle value are referred to as positive angular modes (mode index <18 or >50), while angular modes with a negative intraPredAngle value are referred to as negative angular modes (mode index >18 and <50).

The inverse angle parameter invAngle is derived based on intraPredAngle as follows in Eq. (1):

$$\mathrm{invAngle} = \mathrm{Round}\left(\frac{256*32}{\mathrm{intraPredAngle}}\right) \qquad (1)$$

Note that intraPredAngle values that are multiples of 32 (i.e., values of 0, 32, 64, 128, 256, 512 in this example) correspond with prediction from non-fractional reference array samples, as is the case in the VTM3 specification. Table 2 below illustrates diagonal modes corresponding with various block aspect ratios.

TABLE 2 Diagonal modes corresponding with various block aspect ratios

Block aspect ratio (W/H)    Diagonal modes
1 (square)                  2, 34, 66
2                           8, 28, 72
4                           12, 24, 76
8                           14, 22, 78
16                          16, 20, 80
1/2                         −6, 40, 60
1/4                         −10, 44, 56
1/8                         −12, 46, 54
1/16                        −14, 48, 52
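As a non-limiting illustration, Eq. (1) may be evaluated as in the following Python sketch. The function names inv_angle and uses_integer_positions are assumptions made for this example, and Round is assumed here to round halves away from zero. The sketch rejects an intraPredAngle of 0, for which Eq. (1) is undefined.

```python
import math


def inv_angle(intra_pred_angle):
    """Evaluate invAngle = Round(256 * 32 / intraPredAngle) per Eq. (1).

    Round is assumed to round halves away from zero.
    """
    if intra_pred_angle == 0:
        raise ValueError("Eq. (1) is undefined for intraPredAngle equal to 0")
    quotient = (256 * 32) / intra_pred_angle
    magnitude = math.floor(abs(quotient) + 0.5)
    return magnitude if quotient > 0 else -magnitude


def uses_integer_positions(intra_pred_angle):
    """True when intraPredAngle is a multiple of 32 (non-fractional samples)."""
    return intra_pred_angle % 32 == 0


diagonal = inv_angle(32)      # 8192 / 32
negative = inv_angle(-26)     # 8192 / 26 is about 315.08
```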

Intra sub-partition coding is described in the following portions of this disclosure. Intra sub-partition coding (ISP) (S. De Luxán Hernández, H. Schwarz, D. Marpe, T. Wiegand (HHI), “CE3: Line-based intra coding mode,” JVET-L0076) is a method by which a coding block (e.g., a coding unit) is split or partitioned into multiple subblocks. For example, a coding unit may be split or partitioned into two subblocks, four subblocks, etc. In some instances, a subblock may be referred to as a prediction block. In some examples, each subblock within a coding block is reconstructed in decoding order before the reconstruction of the subsequent subblock in decoding order. In the fourth working draft of VVC (hereinafter “VVC WD4” and available at phenix.it-sudparis.eu/jvet/doc_end_user/documents/13_Marrakech/wg11/JVET-M1001-v2.zip), ISP may only be applied to luma coding blocks. In such examples, the reference samples for these ISP-coded blocks are restricted to be from the reference line that is closest to the coding block (i.e., MRLIdx=0, as discussed below with reference to multiple reference line prediction).

FIGS. 8 and 9 are block diagrams illustrating example divisions of blocks for ISP coding. FIG. 8 illustrates an example of division of 4×8 and 8×4 blocks. FIG. 9 illustrates an example of division of all blocks except 4×8, 8×4, and 4×4 blocks.

One bit is used to signal whether a coding block is split into intra sub-partitions, and a second bit is used to indicate the split type. Example split types include horizontal splitting, vertical splitting, vertical and horizontal splitting, etc. Based on the intra mode and the split type used, two different classes of processing orders may be used, which are referred to as normal order and reverse processing order. In the normal order, the first sub-partition to be processed is the one containing the top-left sample of the CU and then continuing downwards (horizontal split) or rightwards (vertical split). On the other hand, the reverse processing order either starts with the sub-partition containing the bottom-left sample of the CU and continues upwards (horizontal split) or starts with the sub-partition containing the top-right sample of the CU and continues leftwards (vertical split).
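As a non-limiting illustration of the normal and reverse processing orders described above, the following Python sketch lists the sub-partitions of a CU in processing order. The function name isp_sub_partitions, the string values "horizontal" and "vertical", and the (x, y, width, height) tuple layout are assumptions made for this example.

```python
def isp_sub_partitions(cb_width, cb_height, num_parts, split, reverse=False):
    """Return sub-partitions as (x, y, width, height) in processing order.

    Normal order starts at the sub-partition containing the top-left sample
    and proceeds downwards (horizontal split) or rightwards (vertical split).
    Reverse order visits the same sub-partitions in the opposite direction.
    """
    if split == "horizontal":
        if cb_height % num_parts:
            raise ValueError("height must be divisible by the number of parts")
        part_height = cb_height // num_parts
        parts = [(0, i * part_height, cb_width, part_height) for i in range(num_parts)]
    elif split == "vertical":
        if cb_width % num_parts:
            raise ValueError("width must be divisible by the number of parts")
        part_width = cb_width // num_parts
        parts = [(i * part_width, 0, part_width, cb_height) for i in range(num_parts)]
    else:
        raise ValueError("split must be 'horizontal' or 'vertical'")
    return parts[::-1] if reverse else parts


normal_order = isp_sub_partitions(8, 16, 4, "vertical")
reverse_order = isp_sub_partitions(8, 16, 4, "horizontal", reverse=True)
```

In this sketch, the first entry of reverse_order is the sub-partition containing the bottom-left sample of the 8×16 block, consistent with the reverse processing order for a horizontal split.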

In some examples, a variation of ISP may use only the normal processing order, such as in VVC WD4. It is to be noted that the terms “subblock” and “sub-partitions” may be used interchangeably herein. In any case, both terms refer to the blocks obtained by partitioning a coding block using ISP.

Some syntax (e.g., Tables 3 and 4) and semantics associated with ISP in VVC WD4 are shown below, with italics to indicate relevant syntax:

TABLE 3 Syntax table of coding unit

coding_unit( x0, y0, cbWidth, cbHeight, treeType ) {                          Descriptor
  ...
  } else {
    if( treeType = = SINGLE_TREE | | treeType = = DUAL_TREE_LUMA ) {
      if( ( y0 % CtbSizeY ) > 0 )
        intra_luma_ref_idx[ x0 ][ y0 ]                                        ae(v)
      if( intra_luma_ref_idx[ x0 ][ y0 ] = = 0 &&
          ( cbWidth <= MaxTbSizeY | | cbHeight <= MaxTbSizeY ) &&
          ( cbWidth * cbHeight > MinTbSizeY * MinTbSizeY ) )
        intra_subpartitions_mode_flag[ x0 ][ y0 ]                             ae(v)
      if( intra_subpartitions_mode_flag[ x0 ][ y0 ] = = 1 &&
          cbWidth <= MaxTbSizeY && cbHeight <= MaxTbSizeY )
        intra_subpartitions_split_flag[ x0 ][ y0 ]                            ae(v)
      if( intra_luma_ref_idx[ x0 ][ y0 ] = = 0 &&
          intra_subpartitions_mode_flag[ x0 ][ y0 ] = = 0 )
        intra_luma_mpm_flag[ x0 ][ y0 ]                                       ae(v)
      if( intra_luma_mpm_flag[ x0 ][ y0 ] )
        intra_luma_mpm_idx[ x0 ][ y0 ]                                        ae(v)
      ...

TABLE 4 Syntax table of transform tree

transform_tree( x0, y0, tbWidth, tbHeight, treeType) {                        Descriptor
  InferTuCbfLuma = 1
  if( IntraSubPartSplitType = = NO_ISP_SPLIT ) {
    if( tbWidth > MaxTbSizeY | | tbHeight > MaxTbSizeY ) {
      trafoWidth = ( tbWidth > MaxTbSizeY ) ? (tbWidth / 2) : tbWidth
      trafoHeight = ( tbHeight > MaxTbSizeY ) ? (tbHeight / 2) : tbHeight
      transform_tree( x0, y0, trafoWidth, trafoHeight )
      if( tbWidth > MaxTbSizeY )
        transform_tree( x0 + trafoWidth, y0, trafoWidth, trafoHeight, treeType )
      if( tbHeight > MaxTbSizeY )
        transform_tree( x0, y0 + trafoHeight, trafoWidth, trafoHeight, treeType )
      if( tbWidth > MaxTbSizeY && tbHeight > MaxTbSizeY )
        transform_tree( x0 + trafoWidth, y0 + trafoHeight, trafoWidth, trafoHeight, treeType )
    } else {
      transform_unit( x0, y0, tbWidth, tbHeight, treeType, 0 )
    }
  } else if( IntraSubPartitionsSplitType = = ISP_HOR_SPLIT ) {
    trafoHeight = tbHeight / NumIntraSubPartitions
    for( partIdx = 0; partIdx < NumIntraSubPartitions; partIdx++ )
      transform_unit( x0, y0 + trafoHeight * partIdx, tbWidth, trafoHeight, treeType, partIdx )
  } else if( IntraSubPartitionsSplitType = = ISP_VER_SPLIT ) {
    trafoWidth = tbWidth / NumIntraSubPartitions
    for( partIdx = 0; partIdx < NumIntraSubPartitions; partIdx++ )
      transform_unit( x0 + trafoWidth * partIdx, y0, trafoWidth, tbHeight, treeType, partIdx )
  }
}

Semantics of Coding Unit

intra_subpartitions_mode_flag[x0][y0] equal to 1 specifies that the current intra coding unit is partitioned into NumIntraSubPartitions[x0][y0] rectangular transform block sub-partitions. intra_subpartitions_mode_flag[x0][y0] equal to 0 specifies that the current intra coding unit is not partitioned into rectangular transform block sub-partitions.

When intra_subpartitions_mode_flag[x0][y0] is not present, the value of intra_subpartitions_mode_flag is inferred to be equal to 0.

In some examples, an intra sub-partitions split flag may be used to indicate a split type or the split types. For example, an intra_subpartitions_split_flag[x0][y0] may specify a split type or a number of various split types. In some examples, the intra sub-partitions split flag may indicate whether the intra sub-partitions split type is horizontal, vertical, rectangular, a mix of vertical and horizontal, or any other split type. When intra_subpartitions_split_flag[x0][y0] is not present, the video coder may infer the flag to be equal to 0.

The variable IntraSubPartitionsSplitType specifies the type of split used for the current luma coding block as illustrated in Table 5 below. IntraSubPartitionsSplitType is derived as follows:

If intra_subpartitions_mode_flag[x0][y0] is equal to 0, IntraSubPartitionsSplitType is set equal to 0.

Otherwise, the IntraSubPartitionsSplitType is set equal to 1+intra_subpartitions_split_flag[x0][y0].

TABLE 5 Name association to IntraSubPartitionsSplitType

IntraSubPartitionsSplitType    Name of IntraSubPartitionsSplitType
0                              ISP_NO_SPLIT
1                              ISP_HOR_SPLIT
2                              ISP_VER_SPLIT

The variable NumIntraSubPartitions specifies the number of transform block sub-partitions into which an intra luma coding block is divided. NumIntraSubPartitions is derived as follows:

If IntraSubPartitionsSplitType is equal to ISP_NO_SPLIT, NumIntraSubPartitions is set equal to 1.

Otherwise, if one of the following conditions is true, NumIntraSubPartitions is set equal to 2: (1) cbWidth is equal to 4 and cbHeight is equal to 8, or (2) cbWidth is equal to 8 and cbHeight is equal to 4. Otherwise, NumIntraSubPartitions is set equal to 4.
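As a non-limiting illustration, the derivations of IntraSubPartitionsSplitType and NumIntraSubPartitions given above may be sketched in Python as follows. The function name derive_isp_variables and the use of plain integer arguments for the two flags are assumptions made for this example; the constant values follow Table 5.

```python
ISP_NO_SPLIT = 0
ISP_HOR_SPLIT = 1
ISP_VER_SPLIT = 2


def derive_isp_variables(mode_flag, split_flag, cb_width, cb_height):
    """Return (IntraSubPartitionsSplitType, NumIntraSubPartitions).

    mode_flag and split_flag stand for intra_subpartitions_mode_flag and
    intra_subpartitions_split_flag of the current coding block.
    """
    # Split type: 0 when ISP is off, otherwise 1 + split flag.
    split_type = ISP_NO_SPLIT if mode_flag == 0 else 1 + split_flag

    if split_type == ISP_NO_SPLIT:
        num_parts = 1
    elif (cb_width, cb_height) in ((4, 8), (8, 4)):
        num_parts = 2
    else:
        num_parts = 4
    return split_type, num_parts


no_isp = derive_isp_variables(0, 0, 16, 16)
small_block = derive_isp_variables(1, 1, 4, 8)
large_block = derive_isp_variables(1, 0, 16, 16)
```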

Multiple reference line prediction will be described in the following portions of this disclosure. The samples in the neighbourhood of a coding block are used for intra prediction of the block. Typically, the reconstructed reference sample lines that are closest to the left and the top boundaries of the coding block are used as the reference samples for intra prediction. In some examples, such as in VVC WD4, a video coder may use other samples as reference samples, such as samples in the neighbourhood of the coding block or CU.

FIG. 10 is a block diagram illustrating the reference sample lines that may be used for intra prediction. For each coding block, an index is signalled that indicates the reference line that is used.

In some examples, such as in VVC WD4, a video coder may only use reference lines with MRLIdx equal to 0, 1 or 3. A video coder may code the index to the reference line used for coding the block (e.g., values 0, 1 and 2 indicating lines with MRLIdx 0, 1 and 3, respectively) with a truncated unary codeword. Planar and DC modes are not used when the reference line used has MRLIdx>0.
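As a non-limiting illustration, the mapping from the coded index to MRLIdx and a truncated unary binarization of that index may be sketched in Python as follows. The names MRL_IDX_FOR_INDEX, truncated_unary, and planar_and_dc_allowed are assumptions made for this example, as is the choice of "1" for the continuation bin and "0" for the terminating bin. The sketch shows only the binarization and does not show any entropy coding of the bins.

```python
# Coded index value -> MRLIdx of the reference line, per the example above.
MRL_IDX_FOR_INDEX = {0: 0, 1: 1, 2: 3}


def truncated_unary(value, max_value):
    """Return the truncated unary bin string for value in [0, max_value].

    The codeword is `value` ones followed by a terminating zero, except
    that the zero is omitted when value equals max_value.
    """
    if not 0 <= value <= max_value:
        raise ValueError("value out of range")
    bins = "1" * value
    if value < max_value:
        bins += "0"
    return bins


def planar_and_dc_allowed(index):
    """Planar and DC modes are not used when MRLIdx is greater than 0."""
    return MRL_IDX_FOR_INDEX[index] == 0


codewords = {index: truncated_unary(index, 2) for index in MRL_IDX_FOR_INDEX}
```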

In some examples, the intra sub-partition (ISP) coding of blocks (e.g., coding units (CUs)) may result in subblocks (e.g., prediction blocks) that have very small size in one or more dimensions. For example, when a video coder codes a coding block of size 4×N (W×H) using ISP coding, and vertical splitting is used, the resultant subblocks have size 1×N when N is greater than 8. In implementations that rely on row-wise storage of samples, increasingly narrow subblocks may result in memory access of samples that are more than the number of samples in the subblock. When a block is of size M×N, accessing the M samples in each row from memory may often involve an overhead, such as accessing more than M samples due to a limitation on the minimum number of samples per access.

Due to the narrow shape of 1×N and 2×N blocks, this problem may be exacerbated as the number of rows is larger for narrow blocks. For example, a 4×4 block may only have four rows. A 1×16 subblock, although having the same number of samples as the 4×4 block, has 16 rows to be accessed. The resultant overhead may cause issues in delay-critical cases, particularly in the worst case where there are several small coding blocks in high resolution sequences. As the reconstruction of a subblock in the coding block may be dependent on the reconstruction of other subblocks in the coding block that precede in decoding order, these subblocks are likely to increase the delay for decoding the block, which may be undesirable.
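As a non-limiting illustration of the overhead described above, the following Python sketch counts the rows accessed and the samples fetched for a block stored row-wise. The function name row_access_cost and the default minimum of four samples per access are hypothetical values chosen for this example and do not describe any particular memory architecture.

```python
import math


def row_access_cost(width, height, min_samples_per_access=4):
    """Estimate memory traffic for reading a width x height block row by row.

    Each row is read in units of min_samples_per_access samples (a
    hypothetical limit), so narrow rows fetch more samples than they use.
    """
    accesses_per_row = math.ceil(width / min_samples_per_access)
    fetched = accesses_per_row * min_samples_per_access * height
    return {"rows": height, "useful": width * height, "fetched": fetched}


square = row_access_cost(4, 4)     # 4x4 block
narrow = row_access_cost(1, 16)    # 1x16 subblock with the same sample count
```

Under these assumptions, the 4×4 block fetches 16 samples over 4 rows, whereas the 1×16 subblock fetches 64 samples over 16 rows to obtain the same 16 useful samples.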

Some solutions for addressing the problem have been proposed. However, each of the proposed solutions involves drawbacks that may result in coding inefficiencies. One proposed solution is to disable the vertical splitting of 4×N and 8×N blocks that result in blocks of size 1×N and 2×N, respectively. However, this proposed solution may reduce the efficiency of ISP. Another proposed solution (i.e., in S. De Luxán Hernández, V. George, J. Ma, T. Nguyen, H. Schwarz, D. Marpe, T. Wiegand (HHI), “CE3: Intra Sub-Partitions Coding Mode,” JVET-M0102) is to restrict the prediction of 1×N subblocks to VER prediction mode such that they are not dependent on the reconstruction of other subblocks in the coding block. However, this proposed solution also results in limiting the intra modes allowed for the coding block, which results in coding inefficiency.

This disclosure proposes techniques to improve the design of sub-partition coding of CUs. One or more of the techniques disclosed here may be implemented separately or implemented together in any combination.

According to a first proposed technique, a video coder (e.g., video encoder 200 and/or video decoder 300) may perform generalized prediction of subblocks (e.g., prediction blocks). As one example, the video coder may code an indication (e.g., a syntax element) to perform generalized prediction for a coding block. In some examples, the indication to perform generalized prediction may involve an indication that the coding block is coded with ISP. In some examples, the indication to perform generalized prediction may apply to only blocks with certain properties, e.g., when the block width is within a threshold range, or when the block height is within a threshold range, or when the number of samples in the block is within a threshold range, or when the sum of the block width and block height is within a threshold range, or when the minimum or maximum of the block width and block height is within a threshold range. In some examples, for a block, a subset of intra modes may be defined from the set of allowed intra modes for the block. In such examples, generalized prediction may only apply to the block when the block is coded with an intra mode that belongs to the subset. In some examples, the indication may be inferred from the block properties/characteristics, the intra mode and the properties of neighbouring blocks, or no explicit signalling may be used. For instance, when the block width is less than a particular threshold, generalized prediction may be applied.

As another example, based on the indication of performing generalized prediction, the video coder may split a coding unit into one or more prediction blocks.

As another example, the video coder may define one or more prediction block groups (PBGs). A PBG may include one or more prediction blocks. The prediction of any sample in a PBG may not be directly dependent on the predicted value of any other sample in the PBG. For example, a video coder may reconstruct samples of prediction blocks included in a first PBG independently of samples of other prediction blocks included in the same PBG. In this way, the prediction of samples in the first PBG is not dependent on the predicted value of other samples in the first PBG.

In some examples, each PBG contains exactly one prediction block. In some examples, each PBG may contain n_(PB) prediction blocks (e.g., a prediction block quantity). In such examples, the video coder may determine the value of n_(PB) based on the width and height of the coding unit. In another example, the video coder may determine the value of n_(PB) based on the number of samples of the coding unit or on the number of samples and the width and height of the coding unit. In some examples, the video coder may also use the split type (e.g., horizontal, vertical, rectangular, a mix of horizontal and vertical, etc.) and/or the intra mode (e.g., angular, etc.) to determine the value of n_(PB) for a coding unit or coding block. In some examples, a video coder may specify prediction blocks that are not rectangular in shape.

In some examples, the restriction that “the prediction of any sample in a PBG may not be dependent on the predicted value of any other sample in the PBG” may be relaxed. For example, when certain filtering operations are applied on predicted blocks that may result in a dependence between the predicted values of samples within a PBG, the restriction of prediction dependence of samples within a PBG may still be considered valid as long as the regular intra prediction (without the post-prediction filtering operations) does not result in a dependence. For instance, the restriction may not apply to dependence due to position dependent intra prediction combination (PDPC), boundary filtering or other similar operations. In some examples, each prediction block in a coding block may be contained in one and only one PBG. In some examples, this restriction may be stricter. For example, it may be restricted that each prediction block in a coding block must be contained in one PBG, and only one PBG.

As another example, the video coder may specify a decoding order of the PBGs; the decoding order may be defined such that samples in a PBG P1 do not depend on the samples in a PBG P2 when P1 precedes P2 in decoding order.

As another example, for each PBG, the video coder may specify a set of available reference samples (described below).

As another example, the video coder may obtain a predicted value for a sample in the PBG based on the available reference samples for the PBG and the intra mode used for prediction.

According to a second proposed technique, a video coder may perform generalized planar prediction. As one example, the video coder may define a set of prediction blocks for a coding block. In some examples, the video coder may code one or more of the prediction blocks with a Planar mode. The video coder may perform the determination to split a coding block into planar prediction blocks by one or more of the following:

1. An indication in the bitstream (e.g., a flag) that the block is to be split into planar prediction blocks. In some examples, the coding block may be indicated to be coded by the Planar mode. In another example, the indication may be that the coding block is coded using ISP.

2. Planar prediction blocks may be indicated by a separate mode value M_(PPB). When a block is indicated to be coded with M_(PPB), the block is split into planar prediction blocks.

3. The determination may be inferred based on certain block characteristics and other syntax elements in the bitstream. For example, a block with a certain maximum number of samples and coded with ISP may be split into planar prediction blocks.

In some examples, the video coder may determine the number of planar prediction blocks, N_(PPB), and the splitting decisions, by one or more of the following: N_(PPB) may be signalled in the bitstream (e.g., for all blocks or for blocks with certain characteristics) or derived based on the block characteristics and other syntax elements in the bitstream. For example, N_(PPB) may be set equal to the number of intra sub-partition blocks for a coding block. In other examples, the value of N_(PPB) may be set equal to 4. The split direction or method may be determined by the block characteristics or derived from syntax elements in the bitstream.

In some examples, the block may be split horizontally into N_(PPB) planar prediction blocks. The decision to split horizontally may, for example, be determined by whether a horizontal split direction is indicated.

FIG. 11 is a conceptual diagram illustrating an example of how a coding block may be split into four horizontal planar prediction blocks.

In some examples, the video coder may split the block vertically into N_(PPB) planar prediction blocks. The video coder may determine whether to split vertically based on whether a vertical split direction is indicated, for example.

FIG. 12 is a conceptual diagram illustrating an example of how a coding block may be split into four vertical planar prediction blocks.

In some examples, the video coder may split the block horizontally and vertically, resulting in a total of N_(PPB) planar prediction blocks. The video coder may determine whether to split horizontally and vertically based on one or more characteristics of the block (e.g., characteristics of a CU). In some examples, the video coder may determine whether to split horizontally and vertically when no split direction is indicated.

In some examples, the block may be split into N_(PPB) rectangular regions that share the same top-left sample as the coding block, with each rectangular region containing the one before it. Starting with the smallest region, which is defined as the first planar prediction block, samples that belong to the next rectangular region but do not belong to the previous rectangular region are said to constitute the succeeding planar prediction block. FIG. 13 is a conceptual diagram illustrating an example of such a planar prediction block structure.
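
The nested-region structure can be sketched as a lookup from a sample position to its planar prediction block. The sketch assumes the regions are supplied as a list of (width, height) pairs ordered from smallest to largest, each anchored at the top-left sample of the coding block; the function name and the region sizes used in the usage note are hypothetical, since the text does not fix them.

```python
def nested_block_index(x, y, regions):
    """Return the index of the planar prediction block containing (x, y).

    `regions` is a list of (width, height) rectangles, smallest first,
    all sharing the coding block's top-left sample. A sample belongs to
    the first region that contains it, i.e. to the part of that region
    not covered by the previous, smaller region.
    """
    for idx, (w, h) in enumerate(regions):
        if x < w and y < h:
            return idx
    raise ValueError("sample lies outside the largest region")
```

For a 4×4 coding block with assumed regions of 1×1, 2×2, 3×3 and 4×4, the sample at (0, 0) falls in the first block, while every later block is the L-shaped band added by the next region.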

In some examples, the video coder may also predict a planar prediction block by a method that is similar to the planar prediction mode.

In some examples, the video coder may define a coding order for the prediction blocks. For example, the coding order may be defined by considering the following (a. through d.):

a. Assign a representative sample for each prediction block; e.g., the top-left sample of a prediction block. In some cases (non-rectangular blocks), the top-left sample may not be defined, and the representative sample may be specified as the left-most sample that is in the top-most row of samples of the prediction block. It must be understood that the coding order may also be specified with other definitions of representative samples.

b. Define a scan order of representative samples (e.g., raster scan, zigzag scan).

c. A prediction block that has a representative sample earlier in the scan order is coded earlier than a prediction block that has a representative sample later in the scan order.

d. A similar definition of coding order may also be specified for prediction block groups.
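
Items a. through c. can be sketched as a sort over representative samples. The sketch below assumes each prediction block is given as a list or set of (x, y) sample positions and uses a raster scan; the function names are hypothetical, and other representative-sample definitions or scan orders would be equally consistent with the text.

```python
def representative_sample(block):
    """Left-most sample in the top-most row of a block.

    `block` is a list or set of (x, y) positions, so the definition
    also works for non-rectangular prediction blocks.
    """
    top = min(y for _, y in block)
    left = min(x for x, y in block if y == top)
    return (left, top)


def coding_order(blocks):
    """Return block indices sorted by raster scan of representatives."""
    def raster_key(index):
        x, y = representative_sample(blocks[index])
        return (y, x)  # row first, then column

    return sorted(range(len(blocks)), key=raster_key)
```

For example, if the right half of a block is listed before the left half, `coding_order` still places the left half first, because its representative sample comes earlier in raster order.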

In some examples, the video coder may specify a planar prediction block group (Planar PBG), including one or more planar prediction blocks, similar to the idea of PBG discussed above.

The boundaries between the planar prediction blocks may be smoothed by applying filtering across the prediction block boundaries within the coding block. One example of such filtering is lowpass filtering. A second example is the processing performed in PDPC mode. A third example is a deblocking filter process. These processes may be applied as a second processing step after prediction or as a post-processing step.

It is to be understood that the above examples of planar prediction blocks are only some examples for illustrating the concept. The samples in the coding block may be split into spatial regions that are defined as planar prediction blocks, and one or more methods disclosed in this application may be applied to them.

Further, the term blocks (coding blocks, prediction blocks, etc.) as used in this document may apply to luma blocks or chroma blocks (or both), or in general to coding blocks of any component of the video.

According to a third proposed technique, a video coder may determine the availability of reference samples. For each PBG, the available reference samples may be specified as follows:

In some examples, the set of available reference samples is the same for all the prediction blocks in the coding block. There may be multiple implementations of such a system. As one example, the reference sample derivation process is unchanged, and changes are made to the prediction process. The set of available reference samples is derived once for the coding block. In some examples, changes may be made to the prediction process to ensure that the reference samples may not be immediately adjacent to the prediction block, or as specified by MRLIdx relative to the prediction block.

In some examples, the reference sample derivation is changed, and no changes are made to the prediction process. In such examples, the available reference samples for a prediction block are derived from the set of available reference samples for the coding block. The derivation may result in derived values of the reference samples for the prediction block at positions that correspond with the reference sample line used (i.e., based on the value of MRLIdx).

In some examples, the predicted samples from a preceding PBG in decoding order are marked as unavailable.

In some examples, the processed values of a previously coded PBG of the same coding block may be used for intra prediction of a PBG, in addition to the reference samples from the neighbouring coding blocks. As one example, the processed values refer to the reconstructed samples of the PBG. As another example, the processed values refer to the reconstructed samples of the PBG before loop filtering/post-filtering operations. As another example, the processed values refer to the predicted samples of the PBG.

According to a fourth proposed technique, a video coder may perform transform coding. The prediction of samples in a block may be performed by one or more methods described herein. However, the techniques in this section may also apply to coding blocks that are not predicted using generalized prediction as described above.

In some examples, the video coder may apply a transform to the residual samples in a PBG. A set of applicable shapes may be defined such that the video coder may apply a transform to all the samples of the applicable shape. For example, the applicable shape may include square and/or rectangular shapes. In some examples, transform coding of the block may be performed by taking the entire coding block as one block. In such examples, only one transform coding may apply to the samples in the coding block irrespective of the number of prediction blocks defined therein. In some examples, the video coder may apply transform coding to a PBG when the shape of the PBG is an applicable shape on which the transform may be applied. For example, if the samples in a PBG form a rectangular region, the transform may be applied to the samples of the PBG. In some examples, the video coder may apply transform coding to one or more subsets of samples in a PBG such that the samples in each such subset form an applicable shape. As one example, the subsets of the PBG may be constrained to be mutually exclusive. As another example, the subsets of the PBG collectively may be constrained to include all the samples of the PBG.

In some examples, applying a transform may also include a transform-skip decision on the block, whereby the sample values of the block are unchanged, an identity transform is applied, or a scaled identity transform is applied. As one example, a video coder may code some prediction blocks in a PBG with transform skip. In addition, the video coder may code some prediction blocks with other types of transform. The decision of applying a transform kernel or transform skip may be inferred (e.g., based on block characteristics, intra mode, neighbouring blocks) or signalled in the bitstream.

In some examples, a subblock partitioning scheme may be defined for the transform coding that is different from the prediction blocks specified. A transform block group (TBG) may be defined using transform subblocks similarly as defined for prediction blocks. One or more techniques described above may apply to TBGs. For example, when the intra sub-partitioning results in subblocks of size 2×N (prediction blocks may be of size 2×N), transform blocks may be defined as 1×N, and each transform block may be independently transform-coded. In another example, prediction blocks may be of size 2×N, prediction block groups may be of size 4×N, and transform blocks may be of size 4×N. In another example, prediction blocks may be of size 2×N, prediction block groups may be of size 4×N and transform blocks may be of size 2×N for an 8×N coding block.

According to a fifth proposed technique, a video coder may disallow certain intra modes. Several intra modes are defined (e.g., angular and non-angular) for intra prediction of blocks. For PBGs, only a subset of the intra modes may be specified to be allowed modes. In other words, certain intra modes may be disallowed for PBGs. The determination of disallowed intra modes may be reached by considering one or more of the following: (1) the split direction, (2) the coding block width, height, or a function of the width and the height, (3) the prediction block width, height, or a function of the width and the height, or (4) the above block characteristics of neighbouring coding blocks and the intra mode used to code the neighbouring coding blocks.

It is understood that the above description (including examples) is in no way limiting the techniques involved and may be combined with other methods that may not be described in this document. One or more determinations or indications may also depend on other conditions including tile group type, partition type, characteristics of one or more neighbouring blocks (width, height, intra mode, prediction type), etc.

Below are two specific examples of the above techniques.

In a first example, a prediction block group (PBG) is specified to have a minimum width of four samples. For example, vertical partitioning of a coding block of size 4×N may result in subblocks of width 1 or 2 and vertical partitioning of 8×N coding blocks may result in subblocks of width 2 or 4.

FIG. 14 is a block diagram illustrating how a coding block 1402 is split into four prediction blocks (PB1-PB4). In an illustrative example, coding block 1402 may be an 8×N coding block that a video coder codes with ISP and with a vertical split. In this example, prediction blocks PB1, PB2, PB3, and PB4 are of size 2×N. Prediction block groups PBG1 and PBG2 are defined with a minimum width (nW) of 4 samples. For example, PBG1 is a prediction block group of size 4×N and PBG2 is a second prediction block group, also of size 4×N. The samples of PBG1 are predicted from the neighbouring reference samples. In some examples, the predicted value of samples in PBG1 may not depend on other sample values in PBG1. The samples of PBG2 are predicted from the neighbouring reference samples and from the predicted samples of PBG1.

In some examples, the transform block (e.g., transform unit (TU)) may be defined to be of the same size as the prediction block. For example, the video coder may determine a size of the TU to be equal to a size of any one prediction block included in a PBG.

In some examples, the samples of PBG2 may be predicted from reference samples from neighbouring coding blocks and from reconstructed samples belonging to the area of PBG1, or only from the reference samples from neighbouring coding blocks.

FIG. 15 is a block diagram illustrating an example where a video coder may code coding block 1502 with ISP and vertical partitioning. In an illustrative example, coding block 1502 may be a 4×N coding block. In such examples, the prediction blocks (PB1-PB4) may be of size 1×N. In some examples, a PBG may be defined to have a width of at least four samples. In such examples, the video coder may group the four prediction blocks (PB1-PB4) into one prediction block group (e.g., PBG1). As such, PBG1 may be a PBG of size 4×N.
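
The grouping shown in FIG. 14 and FIG. 15 can be sketched as follows for a vertical split. The function name and its parameters are hypothetical; the sketch assumes equally wide prediction blocks and a PBG width that is the larger of the minimum width and the prediction block width.

```python
def group_into_pbgs(cb_width, num_pbs, min_pbg_width=4):
    """Group vertically split prediction blocks into PBGs.

    Returns a list of PBGs, each a list of 1-based prediction block
    numbers (PB1, PB2, ...), assuming all blocks have equal width.
    """
    pb_width = cb_width // num_pbs
    pbg_width = max(min_pbg_width, pb_width)
    pbs_per_group = pbg_width // pb_width
    return [
        list(range(start, min(start + pbs_per_group, num_pbs + 1)))
        for start in range(1, num_pbs + 1, pbs_per_group)
    ]
```

Under these assumptions an 8×N block with four 2×N prediction blocks yields two PBGs of two blocks each, and a 4×N block with four 1×N prediction blocks yields a single PBG containing all four.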

In some cases, the generalized prediction (specifying prediction blocks and prediction block groups described herein) only applies to a subset of blocks that satisfy one or more criteria, which may depend on block size, block shapes, subblock width and heights, intra modes, etc. For example, the condition may only apply to coding blocks that generate subblocks that have a width less than or equal to 2. In other cases, the condition may apply to subblocks that have a total number of samples less than or equal to 32 samples.

In another example, the samples of all prediction blocks of a coding block are predicted from the reconstructed reference samples of the neighbouring coding blocks. In such examples, there is no dependency of prediction among the prediction blocks of the coding unit. In some examples, the transform unit size may be equal to the size of the prediction block.

FIG. 16 is a conceptual diagram illustrating an example where an M×N coding block 1602 is split into four prediction blocks (e.g., PB1-PB4). In the illustrative example, the prediction blocks (PB1-PB4) have a size of (M/4)×N. The reference samples 1604 are used as the reference samples for the four prediction blocks. There is no dependence between the predicted samples of one prediction block and those of another. For example, there is no dependence between the predicted samples of PB1 and the predicted samples of PB2, PB3, or PB4.

In some cases, the four subblocks may be predicted in parallel. In another example, the four subblocks (PB1-PB4) may be considered together as one block for prediction. For example, the four subblocks (PB1-PB4) may be considered together as one PBG for prediction.

In some examples, a video coder may predict the samples of some coding blocks using at least one prediction block group having a width of four samples, and some other coding units are predicted with restricted intra modes. The following restrictions, which apply to the coding of 4×N coding blocks and 8×N coding blocks that are coded with ISP, are added to the specification.

For 4×N coding blocks split vertically with ISP where N>4, only the vertical mode may be allowed for intra prediction. This ensures that the prediction of any of the sub-partitions (1×N sub-partitions in the case of a 4×N coding block with N>8 and 2×N sub-partitions in the case of a 4×8 coding block) would be independent of the other sub-partitions in the coding block. In this way, a video coder may perform prediction of the sub-partitions in parallel or by considering all the sub-partitions as a 4×N block for prediction. The intra mode may not be signalled for such coding blocks.
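
The independence that the vertical mode provides can be illustrated with a simplified model of vertical prediction in which each column simply repeats the reference sample directly above it. This is a sketch under that assumption only: it omits reference sample filtering, PDPC and any other post-prediction processing, and the function names are hypothetical.

```python
def predict_vertical(top_ref, height):
    """Simplified vertical mode: every row is a copy of the top reference row."""
    return [list(top_ref) for _ in range(height)]


def predict_vertical_by_columns(top_ref, height):
    """Predict each 1-sample-wide sub-partition on its own, then stitch.

    Each column depends only on its own top reference sample, so the
    columns could be computed in any order or in parallel.
    """
    columns = [predict_vertical([sample], height) for sample in top_ref]
    return [[col[row][0] for col in columns] for row in range(height)]
```

In this simplified model, predicting four 1×N sub-partitions separately gives the same result as predicting the 4×N block in one step, which is the property the restriction relies on.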

FIG. 17 is a conceptual diagram illustrating an example in which a video coder is only allowed to perform vertical intra (VER Intra) mode prediction for a 4×N coding block 1702 that the video coder codes with ISP and vertical split. That is, FIG. 17 shows an example of how the vertically split sub-partitions of a 4×N coding block 1702 may only use vertical intra mode for prediction. In the illustrative example, coding block 1702 may have a height greater than eight samples (N>8). In such examples, coding block 1702 may be split into four 1×N sub-partitions (e.g., subblocks, prediction blocks). In any case, the inverse transform operation is still performed at the 1×N sub-partition level.

For 8×N coding blocks split vertically with ISP where N>4, the prediction of samples is performed on prediction block groups (PBGs) four samples in width. For example, when the coding block size is 8×N (N>4), a video coder may identify two PBGs to each have a size of 4×N. FIG. 18 shows an example of how an 8×N coding block is split into two prediction regions (PBG1 or region 1 and PBG2 or region 2) of size 4×N each, and four transforms of size 2×N each. In the example of FIG. 18, a video coder may code an 8×N (N>4) coding block 1802 with ISP and vertical split. In such examples, the video coder may perform prediction on 4×N prediction regions (or PBGs). In addition, the transform (SP) size may be 2×N. In some examples, the prediction of region 2 is dependent on the reconstruction of region 1. Thus, the prediction of any sample within the coding block 1802 is only dependent on reconstructed samples from neighbouring coding blocks and preceding 4×N prediction regions.

In FIG. 17, the four line residuals of size 1×N, and similarly in FIG. 18, within one prediction region (prediction region 1 or 2), the two residual blocks of size 2×N, can be generated in parallel or together in one step. For example, the vertical inverse transform operation of the two 2×N sub-partitions within each prediction region may be performed by considering the region as a 4×N block because the vertical transform for all the sub-partitions in ISP is identical. The horizontal transform may also be done for the 2×N sub-partitions in parallel, either by separate processing, or by considering the two sub-partitions as a single 4×N block and the horizontal inverse transform operation as a combination of the two smaller horizontal inverse transform operations.
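
The observation about the vertical transform can be checked with a toy example. A vertical transform acts on each column independently, so applying it to a 4-sample-wide region gives the same result as applying it to the two 2-sample-wide halves and joining them. The matrix in the usage note is an arbitrary illustrative kernel, not a transform from any standard, and all function names are hypothetical.

```python
def vertical_transform(block, matrix):
    """Apply `matrix` to every column of `block` (a list of rows)."""
    n = len(block)
    width = len(block[0])
    return [
        [sum(matrix[r][k] * block[k][c] for k in range(n)) for c in range(width)]
        for r in range(n)
    ]


def split_columns(block, left_width):
    """Split a block into left and right parts at column `left_width`."""
    return [row[:left_width] for row in block], [row[left_width:] for row in block]


def join_columns(left, right):
    """Place two blocks of equal height side by side."""
    return [l_row + r_row for l_row, r_row in zip(left, right)]
```

With the toy kernel [[1, 1], [1, -1]] and a two-row, four-column block, transforming the halves separately and joining them reproduces the result of transforming the whole region, which is why the two 2×N sub-partitions may be processed together as one 4×N block in the vertical direction.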

Table 6 below summarizes the partition sizes for prediction and the transform sizes. In some examples, the prediction sizes and transform sizes may be different for other coding block sizes. For example, the prediction and transform sizes may be as specified for ISP in VVC WD4.

TABLE 6
Example partition sizes for prediction sizes and transform sizes

Coding block sizes | Partition sizes for prediction | Transform sizes
4 × 8              | 4 × 8 (vertical mode only)     | 2 × N
4 × N (N > 8)      | 4 × N (vertical mode only)     | 1 × N
8 × N (N > 4)      | 4 × N                          | 2 × N
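
The rows of Table 6 can be expressed as a small lookup. The function name and return layout are hypothetical, and the sketch covers only the vertically split cases listed in the table; other block sizes return None, consistent with the statement that they may follow the sizes specified for ISP in VVC WD4.

```python
def isp_vertical_sizes(cb_width, cb_height):
    """Return (prediction size, transform size, vertical_mode_only).

    Sizes are (width, height) tuples taken from Table 6. Returns None
    for coding block sizes that the table does not list.
    """
    n = cb_height
    if cb_width == 4 and n == 8:
        return (4, 8), (2, n), True
    if cb_width == 4 and n > 8:
        return (4, n), (1, n), True
    if cb_width == 8 and n > 4:
        return (4, n), (2, n), False
    return None
```

For instance, a 4×16 coding block is predicted as one 4×16 region in vertical mode with 1×16 transforms, while an 8×16 coding block is predicted in 4×16 regions with 2×16 transforms.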

Changes to relevant sections of the specification in the fifth working draft of VVC (hereinafter “VVC WD5” and available at phenix.it-sudparis.eu/jvet/doc_end_user/documents/14_Geneva/wg11/JVET-N1001-v8.zip) are provided below with change marks. Deleted items are shown with

and added items are shown with bold italics.

In some examples, a video encoder may use the following inputs in order to perform a coding process:

1. a sample location (xTb0, yTb0) specifying the top-left sample of the current transform block relative to the top-left sample of the current picture,

2. a variable nTbW specifying the width of the current transform block,

3. a variable nTbH specifying the height of the current transform block,

4. a variable predModeIntra specifying the intra prediction mode, and

5. a variable cIdx specifying the colour component of the current block.

Output of the process is a modified reconstructed picture before in-loop filtering.

The maximum transform block size maxTbSize is derived using Eq. 2, as follows:

maxTbSize = (cIdx == 0) ? MaxTbSizeY : MaxTbSizeY / 2   (2)

The luma sample location is derived using Eq. 3, as follows:

(xTbY, yTbY) = (cIdx == 0) ? (xTb0, yTb0) : (xTb0 * 2, yTb0 * 2)   (3)
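
Eqs. 2 and 3 translate directly into code. The sketch below uses hypothetical function names and assumes integer division for MaxTbSizeY / 2; the factor of two for non-luma components is taken as written in the equations.

```python
def max_tb_size(c_idx, max_tb_size_y):
    """Eq. 2: luma uses MaxTbSizeY, other components use half of it."""
    return max_tb_size_y if c_idx == 0 else max_tb_size_y // 2


def luma_location(c_idx, x_tb0, y_tb0):
    """Eq. 3: map a component sample location to the luma sample grid."""
    if c_idx == 0:
        return (x_tb0, y_tb0)
    return (x_tb0 * 2, y_tb0 * 2)
```

For example, with MaxTbSizeY equal to 64, a chroma block (cIdx not equal to 0) at (8, 4) has a maximum transform size of 32 and a luma location of (16, 8).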

Depending on maxTbSize, the following applies:

If IntraSubPartitionsSplitType is equal to ISP_NO_SPLIT and nTbW is greater than maxTbSize or nTbH is greater than maxTbSize, the following ordered steps apply (1. through 5.):

1. The variables newTbW and newTbH are derived as follows:

newTbW = (nTbW > maxTbSize) ? (nTbW / 2) : nTbW
newTbH = (nTbH > maxTbSize) ? (nTbH / 2) : nTbH

2. The general decoding process for intra blocks as specified in this clause is invoked with the location (xTb0, yTb0), the transform block width nTbW set equal to newTbW and the height nTbH set equal to newTbH, the intra prediction mode predModeIntra, and the variable cIdx as inputs, and the output is a modified reconstructed picture before in-loop filtering.

3. If nTbW is greater than maxTbSize, the general decoding process for intra blocks as specified in this clause is invoked with the location (xTb0, yTb0) set equal to (xTb0+newTbW, yTb0), the transform block width nTbW set equal to newTbW and the height nTbH set equal to newTbH, the intra prediction mode predModeIntra, and the variable cIdx as inputs, and the output is a modified reconstructed picture before in-loop filtering.

4. If nTbH is greater than maxTbSize, the general decoding process for intra blocks as specified in this clause is invoked with the location (xTb0, yTb0) set equal to (xTb0, yTb0+newTbH), the transform block width nTbW set equal to newTbW and the height nTbH set equal to newTbH, the intra prediction mode predModeIntra, and the variable cIdx as inputs, and the output is a modified reconstructed picture before in-loop filtering.

5. If nTbW is greater than maxTbSize and nTbH is greater than maxTbSize, the general decoding process for intra blocks as specified in this clause is invoked with the location (xTb0, yTb0) set equal to (xTb0+newTbW, yTb0+newTbH), the transform block width nTbW set equal to newTbW and the height nTbH set equal to newTbH, the intra prediction mode predModeIntra, and the variable cIdx as inputs, and the output is a modified reconstructed picture before in-loop filtering.
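
Ordered steps 1 through 5 amount to a recursive split of an oversized block. The sketch below is illustrative only: the function name is hypothetical, it models just the non-ISP branch, and instead of invoking a decoding process it yields the location and size of each block that would finally be decoded.

```python
def split_oversized(x, y, w, h, max_tb):
    """Yield (x, y, width, height) for each block reached by steps 1-5.

    Models the recursion only; each yielded tuple stands in for one
    invocation of the general decoding process on a block that fits
    within max_tb in both dimensions.
    """
    if w <= max_tb and h <= max_tb:
        yield (x, y, w, h)
        return
    new_w = w // 2 if w > max_tb else w  # step 1
    new_h = h // 2 if h > max_tb else h
    yield from split_oversized(x, y, new_w, new_h, max_tb)  # step 2
    if w > max_tb:  # step 3
        yield from split_oversized(x + new_w, y, new_w, new_h, max_tb)
    if h > max_tb:  # step 4
        yield from split_oversized(x, y + new_h, new_w, new_h, max_tb)
    if w > max_tb and h > max_tb:  # step 5
        yield from split_oversized(x + new_w, y + new_h, new_w, new_h, max_tb)
```

For example, a 128×128 block with a maximum transform size of 64 is visited as four 64×64 blocks in the order top-left, top-right, bottom-left, bottom-right.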

Otherwise, the following ordered steps apply (1. and 2.). For example, when a video coder determines that IntraSubPartitionsSplitType is not equal to ISP_NO_SPLIT (e.g., when IntraSubPartitionsSplitType is equal to either ISP_VER_SPLIT or ISP_HOR_SPLIT) or when nTbW and nTbH are both less than or equal to maxTbSize, the video coder may derive the following variables.

1. In some examples, a video coder may derive a plurality of variables, including nW, nH, xPartIncrement, yPartIncrement, nPbW and pbFactor. For example, the video coder may derive such variables using Eqs. (6) through (11), as follows:

nW = IntraSubPartitionsSplitType == ISP_VER_SPLIT ? nTbW / NumIntraSubPartitions : nTbW   (6)
nH = IntraSubPartitionsSplitType == ISP_HOR_SPLIT ? nTbH / NumIntraSubPartitions : nTbH   (7)
xPartIncrement = ISP_VER_SPLIT ? 1 : 0   (8)
yPartIncrement = ISP_HOR_SPLIT ? 1 : 0   (9)
nPbW = max(4, nW)   (10)
pbFactor = nPbW / nW   (11)

2. In addition, variables xPartIdx and yPartIdx are set equal to 0.
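
The derivation in Eqs. (6) through (11) can be sketched as a single function. The numeric values given to the split-type constants and the function name are hypothetical, and Eqs. (8) and (9) are read here as tests on IntraSubPartitionsSplitType, which is an interpretation of the shorthand in those equations.

```python
# Hypothetical numeric values; only the distinct names matter here.
ISP_NO_SPLIT, ISP_HOR_SPLIT, ISP_VER_SPLIT = range(3)


def derive_partition_vars(split_type, n_tb_w, n_tb_h, num_sub_partitions):
    """Derive nW, nH, the index increments, nPbW and pbFactor."""
    n_w = n_tb_w // num_sub_partitions if split_type == ISP_VER_SPLIT else n_tb_w  # (6)
    n_h = n_tb_h // num_sub_partitions if split_type == ISP_HOR_SPLIT else n_tb_h  # (7)
    x_part_increment = 1 if split_type == ISP_VER_SPLIT else 0  # (8)
    y_part_increment = 1 if split_type == ISP_HOR_SPLIT else 0  # (9)
    n_pb_w = max(4, n_w)  # (10) prediction width is at least four samples
    pb_factor = n_pb_w // n_w  # (11) sub-partitions per prediction region
    return {
        "nW": n_w,
        "nH": n_h,
        "xPartIncrement": x_part_increment,
        "yPartIncrement": y_part_increment,
        "nPbW": n_pb_w,
        "pbFactor": pb_factor,
    }
```

For an 8×16 block split vertically into four sub-partitions, this gives nW equal to 2, nPbW equal to 4 and pbFactor equal to 2, so each 4-sample-wide prediction region spans two 2×16 transform sub-partitions.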

In such examples, a video coder may apply the following ordered steps (1. through 4.) successively for i=0 . . . NumIntraSubPartitions−1:

1. The variables xPartIdx, yPartIdx, and xPartPbIdx are updated as follows:

xPartIdx = xPartIdx + xPartIncrement   (12)
yPartIdx = yPartIdx + yPartIncrement   (13)
xPartPbIdx = xPartIdx % pbFactor   (14)

2. In some examples, when xPartPbIdx is equal to 0, a video coder may invoke an intra sample prediction process as specified in clause 8.4.5.2. In such examples, the video coder may invoke the intra sample prediction process with the location (xTbCmp, yTbCmp) set equal to (xTb0+nW*xPartIdx, yTb0+nH*yPartIdx), the intra prediction mode predModeIntra, the transform block width nTbW and height nTbH set equal to nPbW and nH, the coding block width nCbW set equal to nTbW and the coding block height nCbH set equal to nTbH, and the variable cIdx as inputs, and the output is an (nPbW)×(nH) array predSamples (e.g., prediction sample array).

3. The scaling and transformation process as specified in clause 8.7.2 is invoked with the luma location (xTbY, yTbY) set equal to (xTbY+nW*xPartIdx, yTbY+nH*yPartIdx), the variable cIdx, the transform width nTbW and the transform height nTbH set equal to nW and nH as inputs, and the output is an (nPbW)×(nH) array resSamples (e.g., residual sample array). In one example, the output may be an (nW)×(nH) array resSamples.

4. The picture reconstruction process for a colour component as specified in clause 8.7.5 is invoked with the transform block location (xTbComp, yTbComp) set equal to (xTb0+nW*xPartIdx, yTb0+nH*yPartIdx), the transform block width nTbW, the transform block height nTbH set equal to nW and nH, the variable cIdx, the (nW)×(nH) array predSamples[x][y] with x=xPartPbIdx*nW . . . (xPartPbIdx+1)*nW−1, y=0 . . . nH−1, and the (nW)×(nH) array resSamples as inputs, and the output is a modified reconstructed picture before in-loop filtering.

In some examples, the following changes may restrict the intra prediction mode to vertical intra mode for some block sizes. Example changes are made to the signalling in Section 7.3.7.5 of VVC WD5. Deleted items are shown with

and added items are shown with italics.

TABLE 7 Restrict prediction mode

coding_unit( x0, y0, cbWidth, cbHeight, treeType ) {    Descriptor
  if( slice_type != I | | sps_ibc_enabled_flag ) {
    if( treeType != DUAL_TREE_CHROMA && !( cbWidth = = 4 && cbHeight = = 4 && !sps_ibc_enabled_flag ) )
      cu_skip_flag[ x0 ][ y0 ]    ae(v)
    if( cu_skip_flag[ x0 ][ y0 ] = = 0 && slice_type != I && !( cbWidth = = 4 && cbHeight = = 4 ) )
      pred_mode_flag    ae(v)
    if( ( ( slice_type = = I && cu_skip_flag[ x0 ][ y0 ] = = 0 ) | | ( slice_type != I && ( CuPredMode[ x0 ][ y0 ] != MODE_INTRA | | ( cbWidth = = 4 && cbHeight = = 4 && cu_skip_flag[ x0 ][ y0 ] = = 0 ) ) ) ) && sps_ibc_enabled_flag && ( cbWidth != 128 | | cbHeight != 128 ) )
      pred_mode_ibc_flag    ae(v)
  }
  if( CuPredMode[ x0 ][ y0 ] = = MODE_INTRA ) {
    if( sps_pcm_enabled_flag && cbWidth >= MinIpcmCbSizeY && cbWidth <= MaxIpcmCbSizeY && cbHeight >= MinIpcmCbSizeY && cbHeight <= MaxIpcmCbSizeY )
      pcm_flag[ x0 ][ y0 ]    ae(v)
    if( pcm_flag[ x0 ][ y0 ] ) {
      while( !byte_aligned( ) )
        pcm_alignment_zero_bit    f(1)
      pcm_sample( cbWidth, cbHeight, treeType )
    } else {
      if( treeType = = SINGLE_TREE | | treeType = = DUAL_TREE_LUMA ) {
        if( cbWidth <= 32 && cbHeight <= 32 )
          intra_bdpcm_flag[ x0 ][ y0 ]    ae(v)
        if( intra_bdpcm_flag[ x0 ][ y0 ] )
          intra_bdpcm_dir_flag[ x0 ][ y0 ]    ae(v)
        else {
          if( sps_mip_enabled_flag && ( Abs( Log2( cbWidth ) − Log2( cbHeight ) ) <= 2 ) && cbWidth <= MaxTbSizeY && cbHeight <= MaxTbSizeY )
            intra_mip_flag[ x0 ][ y0 ]    ae(v)
          if( intra_mip_flag[ x0 ][ y0 ] ) {
            intra_mip_mpm_flag[ x0 ][ y0 ]    ae(v)
            if( intra_mip_mpm_flag[ x0 ][ y0 ] )
              intra_mip_mpm_idx[ x0 ][ y0 ]    ae(v)
            else
              intra_mip_mpm_remainder[ x0 ][ y0 ]    ae(v)
          } else {
            if( sps_mrl_enabled_flag && ( ( y0 % CtbSizeY ) > 0 ) )
              intra_luma_ref_idx[ x0 ][ y0 ]    ae(v)
            if( sps_isp_enabled_flag && intra_luma_ref_idx[ x0 ][ y0 ] = = 0 && ( cbWidth <= MaxTbSizeY && cbHeight <= MaxTbSizeY ) && ( cbWidth * cbHeight > MinTbSizeY * MinTbSizeY ) )
              intra_subpartitions_mode_flag[ x0 ][ y0 ]    ae(v)
            if( intra_subpartitions_mode_flag[ x0 ][ y0 ] = = 1 && cbWidth <= MaxTbSizeY && cbHeight <= MaxTbSizeY )
              intra_subpartitions_split_flag[ x0 ][ y0 ]    ae(v)
            if( !( IntraSubPartitionsSplitType = = ISP_VER_SPLIT && cbWidth = = 4 ) ) {
              if( intra_luma_ref_idx[ x0 ][ y0 ] = = 0 && intra_subpartitions_mode_flag[ x0 ][ y0 ] = = 0 )
                intra_luma_mpm_flag[ x0 ][ y0 ]    ae(v)
              if( intra_luma_mpm_flag[ x0 ][ y0 ] ) {
                if( intra_luma_ref_idx[ x0 ][ y0 ] = = 0 )
                  intra_luma_not_planar_flag[ x0 ][ y0 ]    ae(v)
                if( intra_luma_not_planar_flag[ x0 ][ y0 ] )
                  intra_luma_mpm_idx[ x0 ][ y0 ]    ae(v)
              } else
                intra_luma_mpm_remainder[ x0 ][ y0 ]    ae(v)
            }
          }
        }
      }
      if( treeType = = SINGLE_TREE | | treeType = = DUAL_TREE_CHROMA )
        intra_chroma_pred_mode[ x0 ][ y0 ]    ae(v)
    }
  } else if( treeType != DUAL_TREE_CHROMA ) { /* MODE_INTER or MODE_IBC */
    ...
  }
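As an illustration of how a parser might evaluate one of the conditions in Table 7, the following minimal sketch mirrors the test that guards intra_subpartitions_mode_flag as it appears in the table. The function name, the Python parameter names, and the example values of MaxTbSizeY (64) and MinTbSizeY (4) are assumptions chosen for illustration; they are not taken from the table.

```python
def isp_mode_flag_is_signaled(sps_isp_enabled_flag, intra_luma_ref_idx,
                              cb_width, cb_height,
                              max_tb_size_y, min_tb_size_y):
    """Return True when intra_subpartitions_mode_flag would be parsed.

    Mirrors the condition shown in Table 7; the function itself is a
    hypothetical helper, not part of any specification.
    """
    return bool(
        sps_isp_enabled_flag
        and intra_luma_ref_idx == 0
        and cb_width <= max_tb_size_y
        and cb_height <= max_tb_size_y
        and cb_width * cb_height > min_tb_size_y * min_tb_size_y
    )


# Example values (assumed): MaxTbSizeY = 64, MinTbSizeY = 4.
signaled_8x8 = isp_mode_flag_is_signaled(1, 0, 8, 8, 64, 4)
signaled_4x4 = isp_mode_flag_is_signaled(1, 0, 4, 4, 64, 4)
```

Under these assumed values, an 8×8 block passes the test while a 4×4 block does not, because its area does not exceed MinTbSizeY * MinTbSizeY.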

Example changes are made to the Derivation Process for luma intra prediction mode in Section 8.4.3 of VVC WD5. Deleted items are shown with strikethrough and added items are shown with italics.

For example, a video coder may perform a derivation process for luma intra prediction mode. Inputs to this process include: (1) a luma location (xCb, yCb) specifying the top-left sample of the current luma coding block relative to the top-left luma sample of the current picture, (2) a variable cbWidth specifying the width of the current coding block in luma samples, and/or (3) a variable cbHeight specifying the height of the current coding block in luma samples.

In this process, based on the inputs above, a video coder may derive the luma intra prediction mode IntraPredModeY[xCb][yCb].

Table 8 specifies the value for the intra prediction mode IntraPredModeY[xCb][yCb] and the associated names. It should be noted that intra prediction modes INTRA_LT_CCLM, INTRA_L_CCLM and INTRA_T_CCLM may only be applicable to chroma components.

TABLE 8 Specification of intra prediction mode and associated names

Intra prediction mode    Associated name
0                        INTRA_PLANAR
1                        INTRA_DC
2 . . . 66               INTRA_ANGULAR2 . . . INTRA_ANGULAR66
81 . . . 83              INTRA_LT_CCLM, INTRA_L_CCLM, INTRA_T_CCLM

IntraPredModeY[xCb][yCb] is derived as follows: If BdpcmFlag[xCb][yCb] is equal to 1 or intra_luma_not_planar_flag[xCb][yCb] is equal to 0, IntraPredModeY[xCb][yCb] is set equal to INTRA_PLANAR. Otherwise, if IntraSubPartitionsSplitType is equal to ISP_VER_SPLIT and cbWidth is equal to 4, IntraPredModeY[xCb][yCb] is set equal to INTRA_ANGULAR50. Otherwise (e.g., intra_luma_not_planar_flag[xCb][yCb] is equal to 1), the neighbouring locations (xNbA, yNbA) and (xNbB, yNbB) are set equal to (xCb−1, yCb+cbHeight−1) and (xCb+cbWidth−1, yCb−1), respectively.
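The three branches of this derivation can be sketched as follows. This is a minimal illustration only: the function name, its argument names, and the convention of returning either a mode or a pair of neighbouring locations are assumptions, and the remaining steps of the derivation that use the neighbouring locations are not shown. The mode numbers 0 and 50 follow Table 8.

```python
INTRA_PLANAR = 0        # per Table 8
INTRA_ANGULAR50 = 50    # per Table 8 (INTRA_ANGULAR2..INTRA_ANGULAR66)


def derive_luma_mode_early(bdpcm_flag, not_planar_flag, split_type,
                           x_cb, y_cb, cb_width, cb_height):
    """Return (mode, neighbours).

    mode is set when one of the first two branches decides the result;
    otherwise mode is None and neighbours holds ((xNbA, yNbA), (xNbB, yNbB))
    for the later steps, which this sketch does not implement.
    """
    if bdpcm_flag == 1 or not_planar_flag == 0:
        return INTRA_PLANAR, None
    if split_type == "ISP_VER_SPLIT" and cb_width == 4:
        return INTRA_ANGULAR50, None
    nb_a = (x_cb - 1, y_cb + cb_height - 1)
    nb_b = (x_cb + cb_width - 1, y_cb - 1)
    return None, (nb_a, nb_b)


# A 4-wide block with a vertical ISP split is assigned the vertical mode.
mode, neighbours = derive_luma_mode_early(0, 1, "ISP_VER_SPLIT", 16, 8, 4, 16)
```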

FIGS. 19A and 19B are conceptual diagrams illustrating an example QTBT structure 130, and a corresponding coding tree unit (CTU) 132. The solid lines represent quadtree splitting, and dashed lines indicate binary tree splitting. In each split (i.e., non-leaf) node of the binary tree, one flag is signaled to indicate which splitting type is used (e.g., horizontal, vertical, etc.). In an illustrative and non-limiting example, a value of 0 may indicate horizontal splitting and a value of 1 may indicate vertical splitting. In another example involving quadtree splitting, there may be no need to indicate the splitting type because quadtree nodes, in general, split a block horizontally and vertically into 4 sub-blocks (e.g., prediction blocks). In such examples, the sub-blocks may or may not be of equal size. Accordingly, video encoder 200 may encode, and video decoder 300 may decode, syntax elements (such as splitting information) for a region tree level (i.e., the first level) of QTBT structure 130 (i.e., the solid lines) and syntax elements (such as splitting information) for a prediction tree level (i.e., the second level) of QTBT structure 130 (i.e., the dashed lines). Video encoder 200 may encode, and video decoder 300 may decode, video data, such as prediction and transform data, for CUs represented by terminal leaf nodes of QTBT structure 130.

In general, CTU 132 of FIG. 19B may be associated with parameters defining sizes of blocks corresponding to nodes of QTBT structure 130 at the first and second levels. These parameters may include a CTU size (representing a size of CTU 132 in samples), a minimum quadtree size (MinQTSize, representing a minimum allowed quadtree leaf node size), a maximum binary tree size (MaxBTSize, representing a maximum allowed binary tree root node size), a maximum binary tree depth (MaxBTDepth, representing a maximum allowed binary tree depth), and a minimum binary tree size (MinBTSize, representing the minimum allowed binary tree leaf node size).

The root node of a QTBT structure corresponding to a CTU may have four child nodes at the first level of the QTBT structure, each of which may be partitioned according to quadtree partitioning. That is, nodes of the first level are either leaf nodes (having no child nodes) or have four child nodes. The example of QTBT structure 130 represents such nodes as including the parent node and child nodes having solid lines for branches. If nodes of the first level are not larger than the maximum allowed binary tree root node size (MaxBTSize), the nodes can be further partitioned by respective binary trees. The binary tree splitting of one node can be iterated until the nodes resulting from the split reach the minimum allowed binary tree leaf node size (MinBTSize) or the maximum allowed binary tree depth (MaxBTDepth). The example of QTBT structure 130 represents such nodes as having dashed lines for branches. The binary tree leaf node is referred to as a coding unit (CU), which is used for prediction (e.g., intra-picture or inter-picture prediction) and transform, without any further partitioning. As discussed above, CUs may also be referred to as “video blocks,” “coding blocks,” or “blocks.”

In one example of the QTBT partitioning structure, the CTU size is set as 128×128 (luma samples and two corresponding 64×64 chroma samples), the MinQTSize is set as 16×16, the MaxBTSize is set as 64×64, the MinBTSize (for both width and height) is set as 4, and the MaxBTDepth is set as 4. The quadtree partitioning is applied to the CTU first to generate quadtree leaf nodes. The quadtree leaf nodes may have a size from 16×16 (i.e., the MinQTSize) to 128×128 (i.e., the CTU size). If the quadtree leaf node is 128×128, the node will not be further split by the binary tree, since the size exceeds the MaxBTSize (i.e., 64×64, in this example). Otherwise, the quadtree leaf node will be further partitioned by the binary tree. Therefore, the quadtree leaf node is also the root node for the binary tree and has a binary tree depth of 0. When the binary tree depth reaches MaxBTDepth (4, in this example), no further splitting is permitted. When the binary tree node has a width equal to MinBTSize (4, in this example), it implies that no further vertical splitting is permitted. Similarly, a binary tree node having a height equal to MinBTSize implies that no further horizontal splitting is permitted for that binary tree node. As noted above, leaf nodes of the binary tree are referred to as CUs, and are further processed according to prediction and transform without further partitioning.
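The binary tree constraints in this example can be expressed as a small check. The sketch below is an illustration under the example parameters above (MaxBTSize 64, MinBTSize 4, MaxBTDepth 4); the function name and the string labels for the split directions are hypothetical.

```python
def allowed_binary_splits(width, height, bt_depth,
                          max_bt_size=64, min_bt_size=4, max_bt_depth=4):
    """Return the set of binary split directions permitted for a node.

    Defaults follow the example QTBT parameters in the text; the helper
    itself is an assumed illustration, not a normative process.
    """
    # A node larger than MaxBTSize cannot be a binary tree root.
    if width > max_bt_size or height > max_bt_size:
        return set()
    # No further splitting once MaxBTDepth is reached.
    if bt_depth >= max_bt_depth:
        return set()
    splits = set()
    if width > min_bt_size:      # width == MinBTSize forbids vertical splits
        splits.add("vertical")
    if height > min_bt_size:     # height == MinBTSize forbids horizontal splits
        splits.add("horizontal")
    return splits


splits_for_128 = allowed_binary_splits(128, 128, 0)   # exceeds MaxBTSize
splits_for_4x16 = allowed_binary_splits(4, 16, 1)     # width at MinBTSize
```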

FIG. 20 is a block diagram illustrating an example video encoder 200 that may perform the techniques of this disclosure. FIG. 20 is provided for purposes of explanation and should not be considered limiting of the techniques as broadly exemplified and described in this disclosure. For purposes of explanation, this disclosure describes video encoder 200 in the context of video coding standards such as the HEVC video coding standard and the H.266 video coding standard in development. However, the techniques of this disclosure are not limited to these video coding standards, and are applicable generally to video encoding and decoding.

In the example of FIG. 20, video encoder 200 includes video data memory 230, mode selection unit 202, residual generation unit 204, transform processing unit 206, quantization unit 208, inverse quantization unit 210, inverse transform processing unit 212, reconstruction unit 214, filter unit 216, decoded picture buffer (DPB) 218, and entropy encoding unit 220. Any or all of video data memory 230, mode selection unit 202, residual generation unit 204, transform processing unit 206, quantization unit 208, inverse quantization unit 210, inverse transform processing unit 212, reconstruction unit 214, filter unit 216, DPB 218, and entropy encoding unit 220 may be implemented in one or more processors or in processing circuitry. Moreover, video encoder 200 may include additional or alternative processors or processing circuitry to perform these and other functions.

Video data memory 230 may store video data to be encoded by the components of video encoder 200. Video encoder 200 may receive the video data stored in video data memory 230 from, for example, video source 104 (FIG. 1). DPB 218 may act as a reference picture memory that stores reference video data for use in prediction of subsequent video data by video encoder 200. Video data memory 230 and DPB 218 may be formed by any of a variety of memory devices, such as dynamic random access memory (DRAM), including synchronous DRAM (SDRAM), magnetoresistive RAM (MRAM), resistive RAM (RRAM), or other types of memory devices. Video data memory 230 and DPB 218 may be provided by the same memory device or separate memory devices. In various examples, video data memory 230 may be on-chip with other components of video encoder 200, as illustrated, or off-chip relative to those components.

In this disclosure, reference to video data memory 230 should not be interpreted as being limited to memory internal to video encoder 200, unless specifically described as such, or memory external to video encoder 200, unless specifically described as such. Rather, reference to video data memory 230 should be understood as reference memory that stores video data that video encoder 200 receives for encoding (e.g., video data for a current block that is to be encoded). Memory 106 of FIG. 1 may also provide temporary storage of outputs from the various units of video encoder 200.

The various units of FIG. 20 are illustrated to assist with understanding the operations performed by video encoder 200. The units may be implemented as fixed-function circuits, programmable circuits, or a combination thereof. Fixed-function circuits refer to circuits that provide particular functionality, and are preset on the operations that can be performed. Programmable circuits refer to circuits that can be programmed to perform various tasks, and provide flexible functionality in the operations that can be performed. For instance, programmable circuits may execute software or firmware that causes the programmable circuits to operate in the manner defined by instructions of the software or firmware. Fixed-function circuits may execute software instructions (e.g., to receive parameters or output parameters), but the types of operations that the fixed-function circuits perform are generally immutable. In some examples, one or more of the units may be distinct circuit blocks (fixed-function or programmable), and in some examples, the one or more units may be integrated circuits.

Video encoder 200 may include arithmetic logic units (ALUs), elementary function units (EFUs), digital circuits, analog circuits, and/or programmable cores, formed from programmable circuits. In examples where the operations of video encoder 200 are performed using software executed by the programmable circuits, memory 106 (FIG. 1) may store the object code of the software that video encoder 200 receives and executes, or another memory within video encoder 200 (not shown) may store such instructions.

Video data memory 230 is configured to store received video data. Video encoder 200 may retrieve a picture of the video data from video data memory 230 and provide the video data to residual generation unit 204 and mode selection unit 202. Video data in video data memory 230 may be raw video data that is to be encoded.

Mode selection unit 202 includes a motion estimation unit 222, motion compensation unit 224, and an intra-prediction unit 226. Mode selection unit 202 may include additional functional units to perform video prediction in accordance with other prediction modes. As examples, mode selection unit 202 may include a palette unit, an intra-block copy unit (which may be part of motion estimation unit 222 and/or motion compensation unit 224), an affine unit, a linear model (LM) unit, or the like.

Mode selection unit 202 generally coordinates multiple encoding passes to test combinations of encoding parameters and resulting rate-distortion values for such combinations. The encoding parameters may include partitioning of CTUs into CUs, prediction modes for the CUs, transform types for residual data of the CUs, quantization parameters for residual data of the CUs, and so on. Mode selection unit 202 may ultimately select the combination of encoding parameters having rate-distortion values that are better than the other tested combinations.
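One common way to compare tested combinations, offered here only as an assumed illustration, is a Lagrangian cost J = D + λ·R, where D is a distortion measure, R is a bit count, and λ trades one against the other. The text above does not specify how the rate-distortion values are compared, so the cost formula, the function name, and the candidate labels below are all hypothetical.

```python
def select_best(candidates, lam):
    """Pick the candidate with the lowest assumed cost J = D + lam * R.

    candidates: iterable of (name, distortion, rate_bits) tuples.
    """
    return min(candidates, key=lambda c: c[1] + lam * c[2])[0]


# Hypothetical tested combinations: (label, distortion, rate in bits).
tested = [("combo_a", 100.0, 10), ("combo_b", 80.0, 40)]
choice_high_lambda = select_best(tested, 1.0)   # costs 110 vs 120
choice_low_lambda = select_best(tested, 0.1)    # costs 101 vs 84
```

A larger λ favors the cheaper-to-signal combination, while a smaller λ favors the lower-distortion one.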

Video encoder 200 may partition a picture retrieved from video data memory 230 into a series of CTUs, and encapsulate one or more CTUs within a slice. Mode selection unit 202 may partition a CTU of the picture in accordance with a tree structure, such as the QTBT structure or the quad-tree structure of HEVC described above. As described above, video encoder 200 may form one or more CUs from partitioning a CTU according to the tree structure. Such a CU may also be referred to generally as a “video block” or “block.”

In general, mode selection unit 202 also controls the components thereof (e.g., motion estimation unit 222, motion compensation unit 224, and intra-prediction unit 226) to generate a prediction block for a current block (e.g., a current CU, or in HEVC, the overlapping portion of a PU and a TU). For inter-prediction of a current block, motion estimation unit 222 may perform a motion search to identify one or more closely matching reference blocks in one or more reference pictures (e.g., one or more previously coded pictures stored in DPB 218). In particular, motion estimation unit 222 may calculate a value representative of how similar a potential reference block is to the current block, e.g., according to sum of absolute differences (SAD), sum of squared differences (SSD), mean absolute difference (MAD), mean squared differences (MSD), or the like. Motion estimation unit 222 may generally perform these calculations using sample-by-sample differences between the current block and the reference block being considered. Motion estimation unit 222 may identify a reference block having a lowest value resulting from these calculations, indicating a reference block that most closely matches the current block.
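The matching metrics named above can be sketched in a few lines. The following is a minimal illustration using nested lists as blocks; the function names and the exhaustive search over a short candidate list are assumptions, and a practical motion search would typically be far more elaborate.

```python
def sad(a, b):
    """Sum of absolute sample-by-sample differences between two blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def ssd(a, b):
    """Sum of squared sample-by-sample differences between two blocks."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def best_match(current, candidates, cost=sad):
    """Return the index of the candidate block with the lowest cost."""
    return min(range(len(candidates)),
               key=lambda i: cost(current, candidates[i]))


current_block = [[1, 2], [3, 4]]
reference_candidates = [[[0, 0], [0, 0]], [[1, 2], [3, 5]]]
best_index = best_match(current_block, reference_candidates)
```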

Motion estimation unit 222 may form one or more motion vectors (MVs) that define the positions of the reference blocks in the reference pictures relative to the position of the current block in a current picture. Motion estimation unit 222 may then provide the motion vectors to motion compensation unit 224. For example, for uni-directional inter-prediction, motion estimation unit 222 may provide a single motion vector, whereas for bi-directional inter-prediction, motion estimation unit 222 may provide two motion vectors. Motion compensation unit 224 may then generate a prediction block using the motion vectors. For example, motion compensation unit 224 may retrieve data of the reference block using the motion vector. As another example, if the motion vector has fractional sample precision, motion compensation unit 224 may interpolate values for the prediction block according to one or more interpolation filters. Moreover, for bi-directional inter-prediction, motion compensation unit 224 may retrieve data for two reference blocks identified by respective motion vectors and combine the retrieved data, such as through sample-by-sample averaging or weighted averaging.
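As a small illustration of the sample-by-sample averaging mentioned for bi-directional inter-prediction, the sketch below combines two already-retrieved reference blocks. The function name and the round-half-up rule `(a + b + 1) >> 1` are assumptions made for this example; the text does not specify a rounding convention.

```python
def average_blocks(ref0, ref1):
    """Combine two reference blocks by sample-by-sample averaging.

    Uses integer rounding (a + b + 1) >> 1, an assumed convention.
    """
    return [[(a + b + 1) >> 1 for a, b in zip(r0, r1)]
            for r0, r1 in zip(ref0, ref1)]


bi_pred = average_blocks([[10, 20], [30, 40]], [[11, 21], [30, 44]])
```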

As another example, for intra-prediction or intra-prediction coding, intra-prediction unit 226 may generate the prediction block from samples neighboring the current block. For example, for directional modes, intra-prediction unit 226 may generally mathematically combine values of neighboring samples and populate these calculated values in the defined direction across the current block to produce the prediction block. As another example, for DC mode, intra-prediction unit 226 may calculate an average of the neighboring samples to the current block and generate the prediction block to include this resulting average for each sample of the prediction block.
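The DC mode described above can be sketched as follows. This is a simplified illustration that assumes the neighboring samples are one row above and one column to the left of the block and that the mean is rounded to the nearest integer; the function name and both of those choices are assumptions, and actual codecs may select and weight the neighbors differently.

```python
def dc_predict(top, left, width, height):
    """Fill a width x height prediction block with the rounded mean of
    the neighboring samples (assumed: one top row and one left column)."""
    neighbors = list(top) + list(left)
    dc = (sum(neighbors) + len(neighbors) // 2) // len(neighbors)
    return [[dc] * width for _ in range(height)]


dc_block = dc_predict([10, 12, 14, 16], [8, 8, 8, 8], 4, 4)
```

With these example neighbors the sum is 84 over 8 samples, so every sample of the prediction block is set to 11.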

Mode selection unit 202 provides the prediction block to residual generation unit 204. Residual generation unit 204 receives a raw, uncoded version of the current block from video data memory 230 and the prediction block from mode selection unit 202. Residual generation unit 204 calculates sample-by-sample differences between the current block and the prediction block. The resulting sample-by-sample differences define a residual block for the current block. In some examples, residual generation unit 204 may also determine differences between sample values in the residual block to generate a residual block using residual differential pulse code modulation (RDPCM). In some examples, residual generation unit 204 may be formed using one or more subtractor circuits that perform binary subtraction.
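The residual computation, and a simple form of RDPCM, can be sketched as follows. The function names are hypothetical, and the horizontal RDPCM variant shown (each residual sample minus its left neighbor) is only one assumed arrangement of the differences the text refers to.

```python
def residual_block(current, prediction):
    """Sample-by-sample difference: current minus prediction."""
    return [[c - p for c, p in zip(rc, rp)]
            for rc, rp in zip(current, prediction)]


def rdpcm_horizontal(residual):
    """Assumed horizontal RDPCM: keep the first sample of each row and
    replace every other sample by its difference from the left neighbor."""
    return [[row[i] if i == 0 else row[i] - row[i - 1]
             for i in range(len(row))]
            for row in residual]


res = residual_block([[5, 7], [9, 11]], [[4, 8], [9, 10]])
res_rdpcm = rdpcm_horizontal([[1, 3, 6]])
```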

In examples where mode selection unit 202 partitions CUs into PUs, each PU may be associated with a luma prediction unit and corresponding chroma prediction units. Video encoder 200 and video decoder 300 may support PUs having various sizes. As indicated above, the size of a CU may refer to the size of the luma coding block of the CU and the size of a PU may refer to the size of a luma prediction unit of the PU. Assuming that the size of a particular CU is 2N×2N, video encoder 200 may support PU sizes of 2N×2N or N×N for intra prediction, and symmetric PU sizes of 2N×2N, 2N×N, N×2N, N×N, or similar for inter prediction. Video encoder 200 and video decoder 300 may also support asymmetric partitioning for PU sizes of 2N×nU, 2N×nD, nL×2N, and nR×2N.

In examples where mode selection unit 202 does not further partition a CU into PUs, each CU may be associated with a luma coding block and corresponding chroma coding blocks. As above, the size of a CU may refer to the size of the luma coding block of the CU. Video encoder 200 and video decoder 300 may support CU sizes of 2N×2N, 2N×N, or N×2N.

For other video coding techniques, such as intra-block copy mode coding, affine-mode coding, and linear model (LM) mode coding, as a few examples, mode selection unit 202, via respective units associated with the coding techniques, generates a prediction block for the current block being encoded. In some examples, such as palette mode coding, mode selection unit 202 may not generate a prediction block, and may instead generate syntax elements that indicate the manner in which to reconstruct the block based on a selected palette. In such modes, mode selection unit 202 may provide these syntax elements to entropy encoding unit 220 to be encoded.

As described above, residual generation unit 204 receives the video data for the current block and the corresponding prediction block. Residual generation unit 204 then generates a residual block for the current block. To generate the residual block, residual generation unit 204 calculates sample-by-sample differences between the prediction block and the current block.

Transform processing unit 206 applies one or more transforms to the residual block to generate a block of transform coefficients (referred to herein as a “transform coefficient block”). Transform processing unit 206 may apply various transforms to a residual block to form the transform coefficient block. For example, transform processing unit 206 may apply a discrete cosine transform (DCT), a directional transform, a Karhunen-Loeve transform (KLT), or a conceptually similar transform to a residual block. In some examples, transform processing unit 206 may perform multiple transforms on a residual block, e.g., a primary transform and a secondary transform, such as a rotational transform. In some examples, transform processing unit 206 does not apply transforms to a residual block.

Quantization unit 208 may quantize the transform coefficients in a transform coefficient block, to produce a quantized transform coefficient block. Quantization unit 208 may quantize transform coefficients of a transform coefficient block according to a quantization parameter (QP) value associated with the current block. Video encoder 200 (e.g., via mode selection unit 202) may adjust the degree of quantization applied to the transform coefficient blocks associated with the current block by adjusting the QP value associated with the CU. Quantization may introduce loss of information, and thus, quantized transform coefficients may have lower precision than the original transform coefficients produced by transform processing unit 206.
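The loss of precision introduced by quantization can be seen in a simple scalar quantizer. The sketch below assumes a step size that doubles every 6 QP values, Qstep = 2^((QP − 4)/6), which is a commonly cited approximation for AVC/HEVC-style codecs and is not stated in this text; the function names and the round-half-away-from-zero rule are likewise assumptions, and practical quantizers use integer arithmetic and scaling lists.

```python
def qstep(qp):
    """Assumed step size: doubles every 6 QP values, equal to 1 at QP 4."""
    return 2.0 ** ((qp - 4) / 6.0)


def quantize(coeffs, qp):
    """Scalar quantization with rounding half away from zero (assumed)."""
    step = qstep(qp)

    def q(c):
        sign = -1 if c < 0 else 1
        return sign * int(abs(c) / step + 0.5)

    return [[q(c) for c in row] for row in coeffs]


def dequantize(levels, qp):
    """Inverse of quantize() up to the rounding loss."""
    step = qstep(qp)
    return [[lv * step for lv in row] for row in levels]


levels = quantize([[10, -7, 3]], 10)       # step size 2.0 at QP 10
restored = dequantize(levels, 10)          # not equal to the input
```

At QP 10 the assumed step size is 2.0, so the coefficients 10, −7, 3 become levels 5, −4, 2 and are restored as 10.0, −8.0, 4.0, showing the information loss described above.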

Inverse quantization unit 210 and inverse transform processing unit 212 may apply inverse quantization and inverse transforms to a quantized transform coefficient block, respectively, to reconstruct a residual block from the transform coefficient block. Reconstruction unit 214 may produce a reconstructed block corresponding to the current block (albeit potentially with some degree of distortion) based on the reconstructed residual block and a prediction block generated by mode selection unit 202. For example, reconstruction unit 214 may add samples of the reconstructed residual block to corresponding samples from the prediction block generated by mode selection unit 202 to produce the reconstructed block.
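The addition performed by the reconstruction unit can be sketched as follows. Clipping the sum to the valid sample range for an assumed bit depth of 8 is an addition made for this illustration; the text above describes only the sample-by-sample addition. The function name is hypothetical.

```python
def reconstruct(prediction, residual, bit_depth=8):
    """Add residual samples to prediction samples, then clip each sum
    to [0, 2**bit_depth - 1] (the clipping step is an assumption)."""
    max_val = (1 << bit_depth) - 1
    return [[min(max(p + r, 0), max_val) for p, r in zip(rp, rr)]
            for rp, rr in zip(prediction, residual)]


recon = reconstruct([[100, 250], [4, 30]], [[5, 10], [-6, 0]])
```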

Filter unit 216 may perform one or more filter operations on reconstructed blocks. For example, filter unit 216 may perform deblocking operations to reduce blockiness artifacts along edges of CUs. Operations of filter unit 216 may be skipped, in some examples.

Video encoder 200 stores reconstructed blocks in DPB 218. For instance, in examples where operations of filter unit 216 are not needed, reconstruction unit 214 may store reconstructed blocks to DPB 218. In examples where operations of filter unit 216 are needed, filter unit 216 may store the filtered reconstructed blocks to DPB 218. Motion estimation unit 222 and motion compensation unit 224 may retrieve a reference picture from DPB 218, formed from the reconstructed (and potentially filtered) blocks, to inter-predict blocks of subsequently encoded pictures. In addition, intra-prediction unit 226 may use reconstructed blocks in DPB 218 of a current picture to intra-predict other blocks in the current picture.

In general, entropy encoding unit 220 may entropy encode syntax elements received from other functional components of video encoder 200. For example, entropy encoding unit 220 may entropy encode quantized transform coefficient blocks from quantization unit 208. As another example, entropy encoding unit 220 may entropy encode prediction syntax elements (e.g., motion information for inter-prediction or intra-mode information for intra-prediction) from mode selection unit 202. Entropy encoding unit 220 may perform one or more entropy encoding operations on the syntax elements, which are another example of video data, to generate entropy-encoded data. For example, entropy encoding unit 220 may perform a context-adaptive variable length coding (CAVLC) operation, a CABAC operation, a variable-to-variable (V2V) length coding operation, a syntax-based context-adaptive binary arithmetic coding (SBAC) operation, a Probability Interval Partitioning Entropy (PIPE) coding operation, an Exponential-Golomb encoding operation, or another type of entropy encoding operation on the data. In some examples, entropy encoding unit 220 may operate in bypass mode where syntax elements are not entropy encoded.

Video encoder 200 may output a bitstream that includes the entropy encoded syntax elements needed to reconstruct blocks of a slice or picture. In particular, entropy encoding unit 220 may output the bitstream.

The operations described above are described with respect to a block. Such description should be understood as being operations for a luma coding block and/or chroma coding blocks. As described above, in some examples, the luma coding block and chroma coding blocks are luma and chroma components of a CU. In some examples, the luma coding block and the chroma coding blocks are luma and chroma components of a PU.

In some examples, operations performed with respect to a luma coding block need not be repeated for the chroma coding blocks. As one example, operations to identify a motion vector (MV) and reference picture for a luma coding block need not be repeated for identifying an MV and reference picture for the chroma blocks. Rather, the MV for the luma coding block may be scaled to determine the MV for the chroma blocks, and the reference picture may be the same. As another example, the intra-prediction process may be the same for the luma coding blocks and the chroma coding blocks.

Video encoder 200 may be configured to perform any of the prediction block grouping techniques described in this disclosure. For instance, intra-prediction unit 226 may perform ISP coding to partition a coding unit (CU) into subblocks (e.g., prediction blocks). In addition, intra-prediction unit 226 may group the subblocks into prediction block groups. In this way, video encoder 200 may be considered to include one or more processors implemented in circuitry and configured to perform splitting (e.g., vertical splitting) of a coding unit (CU) of video data using intra sub-partition (ISP) to form a set of prediction blocks. The prediction blocks may include at least a first prediction block and a second prediction block. Video encoder 200 may then group a plurality of the set of prediction blocks into a first prediction block group (PBG). In addition, video encoder 200 may reconstruct samples of prediction blocks included in the first PBG independently of samples of other prediction blocks included in the first PBG.
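The splitting and grouping steps can be sketched as follows. This is a minimal illustration only: the function names, the representation of a prediction block as an (x, y, width, height) tuple, the number of sub-partitions, and the choice to group consecutive blocks in fixed-size groups are all assumptions made for this example rather than rules taken from the text.

```python
def isp_vertical_split(x0, y0, cb_width, cb_height, num_parts):
    """Split a CU vertically into num_parts equal-width prediction blocks.

    Each block is an (x, y, width, height) tuple (assumed representation).
    """
    part_w = cb_width // num_parts
    return [(x0 + i * part_w, y0, part_w, cb_height)
            for i in range(num_parts)]


def group_prediction_blocks(blocks, group_size):
    """Group consecutive prediction blocks into prediction block groups."""
    return [blocks[i:i + group_size]
            for i in range(0, len(blocks), group_size)]


# An 8x16 CU split into four 2x16 blocks, grouped in pairs (assumed sizes).
blocks = isp_vertical_split(0, 0, 8, 16, 4)
pbgs = group_prediction_blocks(blocks, 2)
first_pbg = pbgs[0]
```

In this sketch the two 2×16 blocks in first_pbg together cover a 4×16 region; per the text, samples of blocks within the same PBG would be reconstructed independently of one another.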

FIG. 21 is a block diagram illustrating an example video decoder 300 that may perform the techniques of this disclosure. FIG. 21 is provided for purposes of explanation and is not limiting on the techniques as broadly exemplified and described in this disclosure. For purposes of explanation, this disclosure describes video decoder 300 according to the techniques of JEM, VVC, and HEVC. However, the techniques of this disclosure may be performed by video coding devices that are configured according to other video coding standards.

In the example of FIG. 21, video decoder 300 includes coded picture buffer (CPB) memory 320, entropy decoding unit 302, prediction processing unit 304, inverse quantization unit 306, inverse transform processing unit 308, reconstruction unit 310, filter unit 312, and decoded picture buffer (DPB) 314. Any or all of CPB memory 320, entropy decoding unit 302, prediction processing unit 304, inverse quantization unit 306, inverse transform processing unit 308, reconstruction unit 310, filter unit 312, and DPB 314 may be implemented in one or more processors or in processing circuitry. Moreover, video decoder 300 may include additional or alternative processors or processing circuitry to perform these and other functions.

Prediction processing unit 304 includes motion compensation unit 316 and intra-prediction unit 318. Prediction processing unit 304 may include additional units to perform prediction in accordance with other prediction modes. As examples, prediction processing unit 304 may include a palette unit, an intra-block copy unit (which may form part of motion compensation unit 316), an affine unit, a linear model (LM) unit, or the like. In other examples, video decoder 300 may include more, fewer, or different functional components.

CPB memory 320 may store video data, such as an encoded video bitstream, to be decoded by the components of video decoder 300. The video data stored in CPB memory 320 may be obtained, for example, from computer-readable medium 110 (FIG. 1). CPB memory 320 may include a CPB that stores encoded video data (e.g., syntax elements) from an encoded video bitstream. Also, CPB memory 320 may store video data other than syntax elements of a coded picture, such as temporary data representing outputs from the various units of video decoder 300. DPB 314 generally stores decoded pictures, which video decoder 300 may output and/or use as reference video data when decoding subsequent data or pictures of the encoded video bitstream. CPB memory 320 and DPB 314 may be formed by any of a variety of memory devices, such as dynamic random access memory (DRAM), including synchronous DRAM (SDRAM), magnetoresistive RAM (MRAM), resistive RAM (RRAM), or other types of memory devices. CPB memory 320 and DPB 314 may be provided by the same memory device or separate memory devices. In various examples, CPB memory 320 may be on-chip with other components of video decoder 300, or off-chip relative to those components.

Additionally or alternatively, in some examples, video decoder 300 may retrieve coded video data from memory 120 (FIG. 1). That is, memory 120 may store data as discussed above with CPB memory 320. Likewise, memory 120 may store instructions to be executed by video decoder 300, when some or all of the functionality of video decoder 300 is implemented in software to be executed by processing circuitry of video decoder 300.

The various units shown in FIG. 21 are illustrated to assist with understanding the operations performed by video decoder 300. The units may be implemented as fixed-function circuits, programmable circuits, or a combination thereof. Similar to FIG. 20, fixed-function circuits refer to circuits that provide particular functionality, and are preset on the operations that can be performed. Programmable circuits refer to circuits that can be programmed to perform various tasks, and provide flexible functionality in the operations that can be performed. For instance, programmable circuits may execute software or firmware that causes the programmable circuits to operate in the manner defined by instructions of the software or firmware. Fixed-function circuits may execute software instructions (e.g., to receive parameters or output parameters), but the types of operations that the fixed-function circuits perform are generally immutable. In some examples, one or more of the units may be distinct circuit blocks (fixed-function or programmable), and in some examples, the one or more units may be integrated circuits.

Video decoder 300 may include ALUs, EFUs, digital circuits, analog circuits, and/or programmable cores formed from programmable circuits. In examples where the operations of video decoder 300 are performed by software executing on the programmable circuits, on-chip or off-chip memory may store instructions (e.g., object code) of the software that video decoder 300 receives and executes.

Entropy decoding unit 302 may receive encoded video data from the CPB and entropy decode the video data to reproduce syntax elements. Prediction processing unit 304, inverse quantization unit 306, inverse transform processing unit 308, reconstruction unit 310, and filter unit 312 may generate decoded video data based on the syntax elements extracted from the bitstream.

In general, video decoder 300 reconstructs a picture on a block-by-block basis. Video decoder 300 may perform a reconstruction operation on each block individually (where the block currently being reconstructed, i.e., decoded, may be referred to as a “current block”).

Entropy decoding unit 302 may entropy decode syntax elements defining quantized transform coefficients of a quantized transform coefficient block, as well as transform information, such as a quantization parameter (QP) and/or transform mode indication(s). Inverse quantization unit 306 may use the QP associated with the quantized transform coefficient block to determine a degree of quantization and, likewise, a degree of inverse quantization for inverse quantization unit 306 to apply. Inverse quantization unit 306 may, for example, perform a bitwise left-shift operation to inverse quantize the quantized transform coefficients. Inverse quantization unit 306 may thereby form a transform coefficient block including transform coefficients.
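The bitwise left-shift form of inverse quantization mentioned above can be sketched as follows. The function name and the idea of passing the shift amount directly as a parameter are assumptions for illustration; in a practical decoder the shift would be derived from the QP, and that derivation is not shown here.

```python
def inverse_quantize_shift(levels, shift):
    """Scale each quantized level by 2**shift using a left shift.

    shift is a hypothetical parameter standing in for a QP-derived value.
    """
    return [[lv << shift for lv in row] for row in levels]


coeff_block = inverse_quantize_shift([[3, -2], [0, 1]], 2)
```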

After inverse quantization unit 306 forms the transform coefficient block, inverse transform processing unit 308 may apply one or more inverse transforms to the transform coefficient block to generate a residual block associated with the current block. For example, inverse transform processing unit 308 may apply an inverse DCT, an inverse integer transform, an inverse Karhunen-Loeve transform (KLT), an inverse rotational transform, an inverse directional transform, or another inverse transform to the transform coefficient block.

Furthermore, prediction processing unit 304 generates a prediction block according to prediction information syntax elements that were entropy decoded by entropy decoding unit 302. For example, if the prediction information syntax elements indicate that the current block is inter-predicted, motion compensation unit 316 may generate the prediction block. In this case, the prediction information syntax elements may indicate a reference picture in DPB 314 from which to retrieve a reference block, as well as a motion vector identifying a location of the reference block in the reference picture relative to the location of the current block in the current picture. Motion compensation unit 316 may generally perform the inter-prediction process in a manner that is substantially similar to that described with respect to motion compensation unit 224 (FIG. 20).

As another example, if the prediction information syntax elements indicate that the current block is intra-predicted, intra-prediction unit 318 may generate the prediction block according to an intra-prediction mode indicated by the prediction information syntax elements. Again, intra-prediction unit 318 may generally perform the intra-prediction process in a manner that is substantially similar to that described with respect to intra-prediction unit 226 (FIG. 20). Intra-prediction unit 318 may retrieve data of neighboring samples to the current block from DPB 314.

Reconstruction unit 310 may reconstruct the current block using the prediction block and the residual block. For example, reconstruction unit 310 may add samples of the residual block to corresponding samples of the prediction block to reconstruct the current block.
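
As a non-limiting illustration, the sample-wise addition performed by reconstruction unit 310 might be sketched in Python as follows. The function name `reconstruct_block`, the `bit_depth` parameter, and the clipping of the sum to the valid sample range are assumptions made for this sketch and are not stated above.

```python
def reconstruct_block(prediction, residual, bit_depth=8):
    """Add each residual sample to the co-located prediction sample.

    Clipping to [0, 2**bit_depth - 1] is an assumption of this sketch.
    """
    if len(prediction) != len(residual):
        raise ValueError("prediction and residual must have the same height")
    max_value = (1 << bit_depth) - 1
    reconstructed = []
    for pred_row, resid_row in zip(prediction, residual):
        if len(pred_row) != len(resid_row):
            raise ValueError("prediction and residual rows must match in width")
        reconstructed.append(
            [min(max(p + r, 0), max_value) for p, r in zip(pred_row, resid_row)]
        )
    return reconstructed


if __name__ == "__main__":
    print(reconstruct_block([[100, 250], [3, 40]], [[-5, 10], [-7, 0]]))
```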

Filter unit 312 may perform one or more filter operations on reconstructed blocks. For example, filter unit 312 may perform deblocking operations to reduce blockiness artifacts along edges of the reconstructed blocks. Operations of filter unit 312 are not necessarily performed in all examples.

Video decoder 300 may store the reconstructed blocks in DPB 314. For instance, in examples where operations of filter unit 312 are not performed, reconstruction unit 310 may store reconstructed blocks to DPB 314. In examples where operations of filter unit 312 are performed, filter unit 312 may store the filtered reconstructed blocks to DPB 314. As discussed above, DPB 314 may provide reference information, such as samples of a current picture for intra-prediction and previously decoded pictures for subsequent motion compensation, to prediction processing unit 304. Moreover, video decoder 300 may output decoded pictures from DPB 314 for subsequent presentation on a display device, such as display device 118 of FIG. 1.

Video decoder 300 may be configured to perform any of the prediction block grouping techniques described in this disclosure. For instance, intra-prediction unit 318 may perform ISP coding to partition a coding unit (CU) into subblocks (e.g., prediction blocks). In addition, intra-prediction unit 318 may group the subblocks into prediction block groups. In this way, video decoder 300 may be considered to include one or more processors implemented in circuitry and configured to perform splitting (e.g., vertical splitting) of a coding unit (CU) of video data using intra sub-partition (ISP) to form a set of prediction blocks. The prediction blocks may include at least a first prediction block and a second prediction block. Video decoder 300 may group a plurality of prediction blocks from the set of prediction blocks into a first prediction block group (PBG). In addition, video decoder 300 may reconstruct samples of prediction blocks included in the first PBG independently of samples of other prediction blocks included in the first PBG.

FIG. 22 is a flowchart illustrating an example method for encoding a current block. The current block may comprise a current CU. Although described with respect to video encoder 200 (FIGS. 1 and 20), it should be understood that other devices may be configured to perform a method similar to that of FIG. 22.

In this example, video encoder 200 initially predicts the current block (350). For example, video encoder 200 may form a prediction block for the current block. Video encoder 200 may then calculate a residual block for the current block (352). In other words, video encoder 200 may calculate luma residual data and chroma residual data for the current block. To calculate a residual block, video encoder 200 may calculate a difference between the original, uncoded block and the prediction block for the current block. Video encoder 200 may then transform and quantize coefficients of the residual block (354). Next, video encoder 200 may scan the quantized transform coefficients of the residual block (356). During the scan, or following the scan, video encoder 200 may entropy encode the transform coefficients (358). For example, video encoder 200 may encode the transform coefficients using CAVLC or CABAC. Video encoder 200 may then output the entropy encoded data of the block (360). Video encoder 200 may also implement a reconstruction loop to reconstruct coded video data using the prediction block grouping techniques described in this disclosure.

FIG. 23 is a flowchart illustrating an example method for decoding a current block of video data. The current block may comprise a current CU. Although described with respect to video decoder 300 (FIGS. 1 and 21), it should be understood that other devices may be configured to perform a method similar to that of FIG. 23.

Video decoder 300 may receive entropy coded data for the current block, such as entropy coded prediction information and entropy coded data for transform coefficients of a residual block corresponding to the current block (370). Video decoder 300 may entropy decode the entropy coded data to determine prediction information for the current block and to reproduce transform coefficients of the residual block (372). Video decoder 300 may predict the current block (374), e.g., using an intra- or inter-prediction mode as indicated by the prediction information for the current block, to calculate a prediction block for the current block. Video decoder 300 may then inverse scan the reproduced transform coefficients (376), to create a block of quantized transform coefficients. Video decoder 300 may then inverse quantize and inverse transform the coefficients to produce a residual block (378). Video decoder 300 may ultimately decode the current block by combining the prediction block and the residual block (380). Video decoder 300 may reconstruct samples of the current block using the prediction block grouping techniques described in this disclosure.

FIG. 24 is a flowchart illustrating an example method for performing prediction block grouping on a current block of video data. The current block may comprise a current CU. Although described with respect to video encoder 200 (FIGS. 1 and 20) and/or video decoder 300 (FIGS. 1 and 21), it should be understood that other devices may be configured to perform a method similar to that of FIG. 24.

In some examples, a video coder (e.g., video encoder 200 and/or video decoder 300) may perform splitting (e.g., vertical splitting) of a coding unit (CU) of video data using intra sub-partition (ISP). For example, the video coder may perform vertical and/or horizontal splitting of a first CU to form a set of prediction blocks (2402). In an illustrative example, the video coder may perform vertical splitting of a first CU to form a set of prediction blocks. In any case, the set of prediction blocks may include all prediction blocks of the CU or only a subset of the prediction blocks of the CU. That is, the video coder may form a set of prediction blocks that does not necessarily include all prediction blocks of the CU. For example, the video coder may perform splitting of the first CU to form a first set of prediction blocks and one or more other sets of prediction blocks. The one or more other sets of prediction blocks may be mutually exclusive of one another and of the first set of prediction blocks. In some examples, the sets of prediction blocks may overlap with one another to some extent, such that one portion of a CU may be partitioned into more than one prediction block. While various techniques of this disclosure describe video coders partitioning CUs into a set of prediction blocks, the techniques of this disclosure are not so limited, and a video coder may partition a CU into multiple sets of prediction blocks.

In some examples, the prediction blocks of a set of prediction blocks may include at least a first prediction block and a second prediction block (e.g., at least two vertical prediction blocks). In some examples, the first prediction block comprises a sample width of less than or equal to two. For example, the first prediction block may have a sample width of 1 or 2. In addition, the second prediction block may have a sample width of 1 or 2. In some examples, the first prediction block and the second prediction block may have a same width. For example, the first prediction block and the second prediction block may both have a sample width of 2. In some examples, however, the first prediction block and the second prediction block may have different widths, such as the first prediction block having a width of 2 and the second prediction block having a width of 1. In such examples, a third prediction block may be included in the first PBG, such that the first PBG may have a width of at least four samples.

In some examples, the video coder may group, from a set of prediction blocks, a plurality of prediction blocks into a first prediction block group (PBG) (2404). In an example, the video coder may group one or more prediction blocks from a first set of prediction blocks into a first PBG. For example, the video coder may group, from the first set of prediction blocks, a first prediction block and a second prediction block into a first PBG. In some examples, the first PBG may only include one prediction block from the first set of prediction blocks.

In some examples, the video coder may group a first set of prediction blocks of a CU to form one or more PBGs. In such examples, the video coder may group the PBGs similarly or differently relative to one or more other sets of prediction blocks of the same CU. For example, the video coder may group, into a first PBG, at least one vertical prediction block from a first set of prediction blocks and at least one horizontal prediction block from the same set of prediction blocks. In such examples, the video coder may group, into a second PBG, at least one vertical prediction block from a second set of prediction blocks and at least one horizontal prediction block from the second set of prediction blocks. In some examples, however, the video coder may instead group, into the second PBG, only vertical prediction blocks from the second set of prediction blocks and/or only horizontal prediction blocks from the second set of prediction blocks, where the first set of prediction blocks includes only vertical subblocks, only horizontal subblocks, a mix of vertical and horizontal blocks, etc., and the second set of prediction blocks includes only vertical subblocks, only horizontal subblocks, a mix of vertical and horizontal blocks, etc. In some examples, the video coder may only group a particular type of prediction blocks into PBGs. In a non-limiting example involving a CU partitioned using multiple split types (e.g., vertical and horizontal splitting of a single CU), the video coder may only group, into PBGs, those prediction blocks that are the result of the vertical splitting part of the overall splitting scheme. As such, the video coder may forgo grouping prediction blocks that the video coder partitioned differently (e.g., using horizontal splitting). In another example, the video coder may form groups that include a vertical prediction block and one or more horizontal prediction blocks that neighbor the vertical prediction block. The video coder may additionally form groups including any leftover prediction blocks (e.g., non-neighboring prediction blocks) or may determine not to form groups for certain prediction blocks in the one or more sets of prediction blocks of a CU. In addition, the video coder may only group prediction blocks that are of a particular size (e.g., a size that satisfies a predefined size threshold).

In some examples, a PBG may be specified to have a minimum dimension of nW samples. For example, the PBG may have a minimum width of four samples (e.g., nW=4). That is, the first PBG may have a sample size width of at least four. In such cases, if the dimension size of the subblocks is less than nW, then one or more prediction blocks that are adjacent may be treated as one PBG of dimension size nW samples. The dimension size may apply to the height, width or a function derived from the height and width (e.g., width*height).
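
As a non-limiting illustration, treating adjacent narrow prediction blocks as one PBG of at least nW samples might be sketched in Python as follows. The function name `group_into_pbgs`, the greedy left-to-right grouping, and the choice to merge any leftover blocks into the last group are all assumptions of this sketch, and only the width dimension is considered.

```python
def group_into_pbgs(subblock_widths, n_w=4):
    """Group adjacent subblocks (by index) so each group spans at least n_w samples.

    Greedy left-to-right grouping is an assumption of this sketch. Leftover
    subblocks that cannot reach n_w on their own are merged into the last group.
    """
    if n_w <= 0 or any(width <= 0 for width in subblock_widths):
        raise ValueError("widths must be positive")
    groups = []
    current, current_width = [], 0
    for index, width in enumerate(subblock_widths):
        current.append(index)
        current_width += width
        if current_width >= n_w:
            groups.append(current)
            current, current_width = [], 0
    if current:
        if groups:
            groups[-1].extend(current)
        else:
            groups.append(current)
    return groups


if __name__ == "__main__":
    print(group_into_pbgs([1] * 8))      # eight 1-wide subblocks
    print(group_into_pbgs([2, 2, 2, 2])) # four 2-wide subblocks
```

In this sketch, a subblock whose width already meets nW forms a group by itself.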

The video coder may reconstruct samples of prediction blocks included in the first PBG independently of samples of other prediction blocks included in the first PBG (2406). A video coder may reconstruct samples of prediction blocks independently of other prediction blocks within a PBG when the video coder reconstructs one sample from one subblock without accessing, reading, or performing other memory access procedures to identify or reconstruct samples in another subblock within the same PBG. As a corollary, a video coder may reconstruct samples from one group that are dependent on the reconstruction of samples from another group when the video coder performs a sequential operation. That is, the sequential operation involves first reconstructing one or more samples from a first group and then reconstructing one or more samples from another group based on the samples from the first group. In some instances, independent reconstruction may involve a parallel procedure in which a video coder reconstructs samples from a first prediction block of a first PBG in parallel with, or at substantially the same time (e.g., without a necessary time delay), as the video coder reconstructs samples from a second prediction block that is in the same first PBG.
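
As a non-limiting illustration, the contrast between independence within a PBG and sequential dependence between PBGs might be sketched in Python with a toy model. The function name `reconstruct_row` is hypothetical, and the prediction rule used here (every sample of a PBG is predicted from the single reconstructed sample immediately to the left of that PBG) is a simplifying assumption, not the prediction process of any particular intra mode.

```python
def reconstruct_row(left_reference, residuals, pbg_width=4):
    """Reconstruct one row of samples, one PBG at a time.

    Toy horizontal prediction (an assumption of this sketch): all samples in a
    PBG are predicted from `reference`, which lies outside the PBG, so no sample
    in the PBG reads another reconstructed sample of the same PBG.
    """
    if pbg_width <= 0:
        raise ValueError("pbg_width must be positive")
    reconstructed = []
    reference = left_reference
    for start in range(0, len(residuals), pbg_width):
        group_residuals = residuals[start:start + pbg_width]
        # Independent within the PBG: each sample uses only `reference`.
        group = [reference + r for r in group_residuals]
        reconstructed.extend(group)
        # Sequential across PBGs: the next PBG depends on this PBG's output.
        reference = group[-1]
    return reconstructed


if __name__ == "__main__":
    print(reconstruct_row(100, [1, 2, 3, 4, 0, 0, -1, 1]))
```

Because the samples of one group share a reference outside the group, the list comprehension for a group could be evaluated in any order or in parallel, whereas the outer loop has to run in order.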

In an example where a first PBG includes a first prediction block and a second prediction block, the video coder may reconstruct samples of the first prediction block independently of samples of the second prediction block. In some examples, when certain filtering operations are applied on predicted blocks that may result in a dependence between the predicted value of samples within a PBG, the restriction of prediction dependence of samples within a PBG may still be considered valid as long as the regular intra prediction (without the post-prediction filtering operations) does not result in a dependence. For example, the restriction may not apply to dependence due to PDPC, boundary filtering or other similar operations.

In some examples, a video coder may form a second PBG, such that a single CU includes at least two PBGs. For example, the video coder may group one or more prediction blocks into a first PBG. Similarly, the video coder may group a second set of prediction blocks into a second PBG. The prediction blocks of each PBG may include a non-overlapping set of prediction blocks. That is, the prediction blocks of one PBG may be mutually exclusive of the prediction blocks of another PBG. In any case, the video coder may reconstruct samples of prediction blocks included in the second PBG independently of samples of other prediction blocks included in the second PBG.

In addition, the video coder may, in some examples, reconstruct samples of prediction blocks included in a second PBG based on samples of the prediction blocks included in the first PBG. That is, the reconstruction of samples in one PBG may be dependent on the reconstruction of samples in another PBG. In such examples, the video coder may still maintain the prediction independence of samples within a single PBG. In some examples, the prediction of samples included in the second PBG may not be based on the prediction blocks included in the first PBG, but instead may be based on the prediction blocks included in another CU, such as a neighboring CU.

In some examples, the video coder may divide the CU into one or more transform blocks. In such examples, a size of the transform block may be equal to a size of the first prediction block or any one prediction block of the CU. In some examples, the size of the transform block may be less than the size of the smallest prediction block included in the first PBG. Furthermore, the size of the transform block may be less than the aggregate size of the prediction blocks included in the first PBG. In another example, the transform block may have a size that is equal to the size of the PBG. In any case, a first transform block associated with the first PBG may have a size equal to the size of a first prediction block or a second prediction block included in the first PBG.

In some examples, a video coder may generate a prediction sample array having a width equal to an aggregated width of at least the first prediction block and the second prediction block. For example, the video coder may output an array of size (nPbW)×(nH). In some examples, the video coder may, when reconstructing samples of prediction blocks, generate the prediction sample array having a width equal to an aggregated width of at least the first prediction block and the second prediction block.
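
As a non-limiting illustration, allocating a prediction sample array whose width is the aggregated width of the grouped prediction blocks might be sketched in Python as follows. The function name `make_prediction_array` and the `fill` parameter are hypothetical; the sketch only shows the (nPbW)×(nH) sizing and does not compute any predicted values.

```python
def make_prediction_array(block_widths, n_h, fill=0):
    """Return an n_h-row array whose width nPbW is the sum of the block widths.

    Rows are indexed first here; that layout is an assumption of this sketch.
    """
    if n_h <= 0 or not block_widths or any(w <= 0 for w in block_widths):
        raise ValueError("dimensions must be positive")
    n_pb_w = sum(block_widths)  # aggregated width of the grouped blocks
    return [[fill] * n_pb_w for _ in range(n_h)]


if __name__ == "__main__":
    array = make_prediction_array([2, 2], n_h=8)
    print(len(array), len(array[0]))  # height, width
```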

In some examples, a video coder may determine a quantity of prediction block groups based on a width of the coding unit and a width of the first prediction block. The video coder may code the CU in accordance with the quantity of prediction block groups such that the CU comprises the quantity of prediction block groups. The quantity of prediction block groups may be referred to herein as pbFactor (e.g., prediction block group factor). In such instances, the video coder may calculate pbFactor by dividing the width of the CU by the width of a prediction block group. The width of a PBG may be predefined to be at least four samples.

In some examples, the video coder may calculate pbFactor by dividing the subblock width (e.g., the prediction block width) by the width of the PBG (e.g., the minimum width of a PBG). For example, when the subblock has a width of 1 or 2, the video coder may calculate pbFactor by dividing the subblock width by the width of the PBG, such as by dividing by four when the PBG has a sample width of four. In another example, when the width of the CU is 8 samples and the width of the prediction block group is 4 samples, the video coder may calculate pbFactor to equal 2. That is, the quantity of PBGs for the CU may be equal to 2.
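
As a non-limiting illustration, the calculation in which pbFactor is obtained by dividing the width of the CU by the width of a PBG might be sketched in Python as follows. The function name `pb_factor` is hypothetical, and rejecting CU widths that are not a multiple of the PBG width is an assumption of this sketch. Only the CU-width-based calculation is shown.

```python
def pb_factor(cu_width, pbg_width=4):
    """Return the quantity of PBGs in a CU as cu_width divided by pbg_width.

    Requiring an exact multiple is an assumption of this sketch.
    """
    if cu_width <= 0 or pbg_width <= 0:
        raise ValueError("widths must be positive")
    if cu_width % pbg_width != 0:
        raise ValueError("cu_width must be a multiple of pbg_width")
    return cu_width // pbg_width


if __name__ == "__main__":
    print(pb_factor(8, 4))  # CU width 8, PBG width 4
```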

In some examples, when the width of a prediction block is four or more, the prediction block and the PBG may have the same width. For example, a video coder may group a prediction block of width four into a PBG of width four. That is, a video coder may group a prediction block of width X into a single PBG of the same width, where X is greater than or equal to a predefined threshold number. In a non-limiting and illustrative example, the predefined threshold number may be four. In another illustrative example, the video coder may group a first prediction block of width four and a second prediction block of width eight into a single PBG of width eight. In such examples, the video coder may only include a portion of the second prediction block in the PBG, such as by further partitioning the second prediction block in order to achieve a particular PBG size. While various techniques of this disclosure discuss width size in certain contexts, the techniques of this disclosure are not so limited, and it will be understood that other constraints and/or considerations may similarly apply to the grouping of prediction blocks into PBGs. For example, the PBG may have a particular height limitation (e.g., a height of four samples) or, in another example, a combination of height and width constraints and/or considerations may similarly apply to the grouping of prediction blocks into PBGs (e.g., a PBG may have a minimum H×W size of M×N, where M and N may not necessarily be equal).

As discussed above, the example method of FIG. 24 as described may be performed by one or more processors implemented in circuitry. The one or more processors may be included in a video encoder or, in some examples, may be included in a video decoder.

Illustrative Examples of the Disclosure Include

Example 1: A method of coding video data, the method comprising: performing splitting of a coding unit (CU) of video data using intra sub-partition (ISP) to form a set of prediction blocks, the prediction blocks including at least a first prediction block and a second prediction block; grouping a plurality of prediction blocks from the set of prediction blocks into a first prediction block group (PBG); and reconstructing samples of prediction blocks included in the first PBG independently of samples of other prediction blocks included in the first PBG.

Example 2: A method according to Example 1, wherein grouping the plurality of the set of prediction blocks into the first PBG comprises grouping a first plurality of the set of prediction blocks into the first PBG, the method further comprising: grouping a second plurality of the set of prediction blocks into a second PBG, the second plurality of the set of prediction blocks and the first plurality of the set of prediction blocks including a non-overlapping set of prediction blocks; and reconstructing samples of prediction blocks included in the second PBG independently of samples of other prediction blocks included in the second PBG.

Example 3: A method according to Example 2, wherein reconstructing samples of prediction blocks included in the second PBG comprises reconstructing samples of prediction blocks included in the second PBG based on samples of the prediction blocks included in the first PBG.

Example 4: A method according to any of Examples 1 through 3, wherein the first PBG comprises a sample size width of at least four.

Example 5: A method according to any of Examples 1 through 4, wherein a first transform block associated with the first PBG has a size equal to the size of the first prediction block.

Example 6: A method according to any of Examples 1 through 5, wherein reconstructing samples of prediction blocks comprises: generating a prediction sample array having a width equal to an aggregated width of at least the first prediction block and the second prediction block.

Example 7: A method according to any of Examples 1 through 6, wherein the first prediction block comprises a sample width of less than or equal to two.

Example 8: A method according to Example 7, wherein the first prediction block and the second prediction block are of a same width.

Example 9: A method according to any of Examples 1 through 8, further comprising: determining a quantity of prediction block groups based on a width of the coding unit and a width of the first prediction block; coding the CU in accordance with the quantity of prediction block groups such that the CU comprises the quantity of prediction block groups.

Example 10: A method according to any of Examples 1 through 9, wherein the method is performed by one or more processors.

Example 11: A method according to any of Examples 1 through 10, further comprising: encoding, in a coded video bitstream, one or more syntax elements that represent prediction information for the CU of video data and one or more syntax elements that represent residual data for the CU of video data.

Example 12: A method according to any of Examples 1 through 11, further comprising: decoding, from a coded video bitstream, one or more syntax elements that represent prediction information for the CU of video data and one or more syntax elements that represent residual data for the CU of video data.

Example 13: A method according to any of Examples 1 through 12, wherein performing splitting of the CU includes splitting the CU using one or more of vertical splitting, horizontal splitting, or a combination of horizontal and vertical splitting.

Example 14: A device for coding video data, the device comprising one or more means for performing the methods of any of Examples 1 through 13. For example, the device of Example 14 may include a memory configured to store video data; and one or more processors implemented in circuitry and configured to: perform splitting of a coding unit (CU) of video data using intra sub-partition (ISP) to form a set of prediction blocks, the prediction blocks including at least a first prediction block and a second prediction block; group a plurality of prediction blocks from the set of prediction blocks into a first prediction block group (PBG); and reconstruct samples of prediction blocks included in the first PBG independently of samples of other prediction blocks included in the first PBG.

Example 15: A device according to Example 14, wherein to group the plurality of the set of prediction blocks into the first PBG, the one or more processors are configured to group a first plurality of the set of prediction blocks into the first PBG, and wherein the one or more processors are further configured to: group a second plurality of the set of prediction blocks into a second PBG, the second plurality of the set of prediction blocks and the first plurality of the set of prediction blocks including a non-overlapping set of prediction blocks; and reconstruct samples of prediction blocks included in the second PBG independently of samples of other prediction blocks included in the second PBG.

Example 16: A device according to Example 15, wherein to reconstruct samples of prediction blocks included in the second PBG, the one or more processors are configured to reconstruct samples of prediction blocks included in the second PBG based on samples of the prediction blocks included in the first PBG.

Example 17: A device according to any of Examples 14 through 16, wherein the first PBG comprises a sample size width of at least four.

Example 18: A device according to any of Examples 14 through 17, wherein a first transform block associated with the first PBG has a size equal to the size of the first prediction block.

Example 19: A device according to any of Examples 14 through 18, wherein to reconstruct the samples of prediction blocks, the one or more processors are configured to: generate a prediction sample array having a width equal to an aggregated width of at least the first prediction block and the second prediction block.

Example 20: A device according to any of Examples 14 through 19, wherein the first prediction block comprises a sample width of less than or equal to two.

Example 21: A device according to Example 20, wherein the first prediction block and the second prediction block are of a same width.

Example 22: A device according to any of Examples 14 through 21, wherein the one or more processors are further configured to: determine a quantity of prediction block groups based on a width of the coding unit and a width of the first prediction block; and code the CU in accordance with the quantity of prediction block groups such that the CU comprises the quantity of prediction block groups.

Example 23: A device according to any of Examples 14 through 22, wherein the one or more processors are included in a video encoder.

Example 24: A device according to any of Examples 14 through 23, wherein the one or more processors are included in a video decoder.

Example 25: A device according to any of Examples 14 through 24, wherein to perform splitting of the CU, the one or more processors are configured to split the CU using vertical splitting.

In some implementations, the above-described examples 1-13 and/or 14-25 can be implemented using a computer-readable storage medium storing instructions that when executed cause one or more processors of a device to perform some or all of the various operations. For example, a computer-readable storage medium can be provided storing instructions that when executed cause one or more processors of a device for decoding and/or encoding video data to: perform splitting of a coding unit (CU) of video data using intra sub-partition (ISP) to form a set of prediction blocks, the prediction blocks including at least a first prediction block and a second prediction block; group a plurality of prediction blocks from the set of prediction blocks into a first prediction block group (PBG); and reconstruct samples of prediction blocks included in the first PBG independently of samples of other prediction blocks included in the first PBG.

In some implementations, the above-described examples 1-13 and/or 14-25 can be implemented using an apparatus comprising one or more means for performing some or all of the various operations. For example, an apparatus for encoding video data includes: means for performing vertical splitting of a coding unit (CU) of video data using intra sub-partition (ISP) to form a set of prediction blocks, the prediction blocks including at least a first prediction block and a second prediction block; means for grouping a plurality of prediction blocks from the set of prediction blocks into a first prediction block group (PBG); and means for reconstructing samples of prediction blocks included in the first PBG independently of samples of other prediction blocks included in the first PBG.

Example 26: A method of coding video data, the method comprising: coding an indication of a performance of generalized prediction for a current coding block of video data; splitting, based on the indication, the current coding block into one or more prediction blocks; defining, based on the one or more prediction blocks, one or more prediction block groups (PBGs); determining a coding order of the PBGs; and coding the PBGs in the coding order.

Example 27: A method according to Example 26, wherein coding the PBGs comprises: obtaining a predicted value for a sample in a current PBG based on available reference samples for the current PBG and an intra mode used for prediction.

Example 28: A method according to any of Examples 26 or 27, wherein: splitting the current coding block into one or more prediction blocks comprises splitting the current coding block into prediction blocks of size 4×N, only a vertical intra prediction mode is allowed for prediction of the prediction blocks and an indication of an intra prediction mode for the prediction blocks is not signaled in a bitstream that includes an encoded representation of the video data.

Example 29: A method according to any of Examples 26 through 28, wherein coding comprises decoding.

Example 30: A method according to any of Examples 26 through 28, wherein coding comprises encoding.

Example 31: A device for coding video data, the device comprising one or more means for performing the methods of any of Examples 26 through 30. For example, the device of Example 31 may include one or more processors configured to: code an indication of a performance of generalized prediction for a current coding block of video data; split, based on the indication, the current coding block into one or more prediction blocks; define, based on the one or more prediction blocks, one or more prediction block groups (PBGs); determine a coding order of the PBGs; and code the PBGs in the coding order.

Example 32: A device according to Example 31, wherein the one or more means comprise one or more processors implemented in circuitry.

Example 33: A device according to any of Examples 31 or 32, further comprising a memory to store the video data.

Example 34: A device according to any of Examples 31 through 33, further comprising a display configured to display decoded video data.

Example 35: A device according to any of Examples 31 through 34, wherein the device comprises one or more of a camera, a computer, a mobile device, a broadcast receiver device, or a set-top box.

Example 36: A device according to any of Examples 31 through 35, wherein the device comprises a video decoder.

Example 37: A device according to any of Examples 31 through 36, wherein the device comprises a video encoder.

In some implementations, the above-described examples 26-30 and/or 31-37 can be implemented using a computer-readable storage medium storing instructions that when executed cause one or more processors of a device to perform some or all of the various operations. For example, a computer-readable storage medium can be provided storing instructions that when executed cause one or more processors of a device for decoding and/or encoding video data to: code an indication of a performance of generalized prediction for a current coding block of video data; split, based on the indication, the current coding block into one or more prediction blocks; define, based on the one or more prediction blocks, one or more prediction block groups (PBGs); determine a coding order of the PBGs; and code the PBGs in the coding order.

In some implementations, the above-described examples 26-30 and/or 31-37 can be implemented using an apparatus comprising one or more means for performing some or all of the various operations. For example, an apparatus for encoding video data includes: means for coding an indication of the performance of generalized prediction for a current coding block of video data; means for splitting, based on the indication, the current coding block into one or more prediction blocks; means for defining, based on the one or more prediction blocks, one or more prediction block groups (PBGs); means for determining a coding order of the PBGs; and means for coding the PBGs in the coding order.
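
As a non-limiting illustration, the splitting, grouping, and ordering operations of Example 26 might be sketched as follows. The names Block, split_vertical, group_into_pbgs, and coding_order are hypothetical and do not appear in this disclosure. The sketch assumes a vertical split into prediction blocks of equal width, a minimum PBG width of four samples, and a left-to-right coding order; other splits, group sizes, and orders are possible.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Block:
    """A prediction block inside a coding block (hypothetical helper type)."""
    x: int       # horizontal offset inside the coding block, in samples
    width: int   # block width, in samples
    height: int  # block height, in samples


def split_vertical(cu_width: int, cu_height: int, pb_width: int) -> List[Block]:
    """Split a coding block into equal-width prediction blocks, left to right."""
    if pb_width <= 0 or cu_width % pb_width != 0:
        raise ValueError("pb_width must be positive and divide cu_width")
    return [Block(x, pb_width, cu_height) for x in range(0, cu_width, pb_width)]


def group_into_pbgs(blocks: List[Block], min_pbg_width: int = 4) -> List[List[Block]]:
    """Group consecutive prediction blocks until each group is wide enough.

    Assumption: a PBG is closed once its aggregated width reaches
    min_pbg_width; leftover blocks are merged into the last group.
    """
    groups: List[List[Block]] = []
    current: List[Block] = []
    width = 0
    for block in blocks:
        current.append(block)
        width += block.width
        if width >= min_pbg_width:
            groups.append(current)
            current, width = [], 0
    if current:
        if groups:
            groups[-1].extend(current)
        else:
            groups.append(current)
    return groups


def coding_order(groups: List[List[Block]]) -> List[int]:
    """Return PBG indices in an assumed left-to-right coding order."""
    return sorted(range(len(groups)), key=lambda i: groups[i][0].x)


if __name__ == "__main__":
    pbs = split_vertical(cu_width=8, cu_height=16, pb_width=2)
    pbgs = group_into_pbgs(pbs)
    for index in coding_order(pbgs):
        print(index, [(b.x, b.width) for b in pbgs[index]])
```

In this sketch, an 8×16 coding block split into 2×16 prediction blocks yields two PBGs of two prediction blocks each, so that each PBG spans four samples in width.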

It is to be recognized that depending on the example, certain acts or events of any of the techniques described herein can be performed in a different sequence, may be added, merged, or left out altogether (e.g., not all described acts or events are necessary for the practice of the techniques). Moreover, in certain examples, acts or events may be performed concurrently, e.g., through multi-threaded processing, interrupt processing, or multiple processors, rather than sequentially.

In one or more examples, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media, which corresponds to a tangible medium such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another, e.g., according to a communication protocol. In this manner, computer-readable media generally may correspond to (1) tangible computer-readable storage media which is non-transitory or (2) a communication medium such as a signal or carrier wave. Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementation of the techniques described in this disclosure. A computer program product may include a computer-readable medium.

By way of example, and not limitation, such computer-readable storage media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage, or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory, tangible storage media. Disk and disc, as used herein, includes compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.

Instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general purpose microprocessors, application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Accordingly, the terms “processor” and “processing circuitry,” as used herein may refer to any of the foregoing structures or any other structure suitable for implementation of the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques could be fully implemented in one or more circuits or logic elements.

The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC) or a set of ICs (e.g., a chip set). Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but do not necessarily require realization by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperative hardware units, including one or more processors as described above, in conjunction with suitable software and/or firmware.

Various examples have been described. These and other examples are within the scope of the following claims.

What is claimed is:
 1. A method of coding video data, the method comprising: coding, via a coded video bitstream, a first syntax element specifying whether a coding unit (CU) of video data is split using intra sub-partition (ISP); coding, via the coded video bitstream and where the first syntax element specifies that the CU is split using ISP, a second syntax element specifying a split type for the CU; and responsive to the split type being vertical split or horizontal split: performing splitting of the CU to form a set of prediction blocks, the prediction blocks including at least a first prediction block and a second prediction block; grouping a plurality of prediction blocks from the set of prediction blocks into a first prediction block group (PBG); and reconstructing samples of prediction blocks included in the first PBG independently of samples of other prediction blocks included in the first PBG.
 2. The method of claim 1, wherein grouping the plurality of the set of prediction blocks into the first PBG comprises grouping a first plurality of the set of prediction blocks into the first PBG, the method further comprising: grouping a second plurality of the set of prediction blocks into a second PBG, the second plurality of the set of prediction blocks and the first plurality of the set of prediction blocks including a non-overlapping set of prediction blocks; and reconstructing samples of prediction blocks included in the second PBG independently of samples of other prediction blocks included in the second PBG.
 3. The method of claim 2, wherein reconstructing samples of prediction blocks included in the second PBG comprises reconstructing samples of prediction blocks included in the second PBG based on samples of the prediction blocks included in the first PBG.
 4. The method of claim 1, wherein the first PBG comprises a sample size width of at least four.
 5. The method of claim 1, wherein a first transform block associated with the first PBG has a size equal to the size of the first prediction block.
 6. The method of claim 1, wherein reconstructing samples of prediction blocks comprises: generating a prediction sample array having a width equal to an aggregated width of at least the first prediction block and the second prediction block.
 7. The method of claim 1, wherein the first prediction block comprises a sample width of less than or equal to two.
 8. The method of claim 7, wherein the first prediction block and the second prediction block are of a same width.
 9. The method of claim 1, further comprising: determining a quantity of prediction block groups based on a width of the coding unit and a width of the first prediction block; coding the CU in accordance with the quantity of prediction block groups such that the CU comprises the quantity of prediction block groups.
 10. The method of claim 1, wherein the method is performed by one or more processors.
 11. The method of claim 1, further comprising: encoding, in the coded video bitstream, one or more syntax elements that represent prediction information for the CU of video data and one or more syntax elements that represent residual data for the CU of video data.
 12. The method of claim 1, further comprising: decoding, from the coded video bitstream, one or more syntax elements that represent prediction information for the CU of video data and one or more syntax elements that represent residual data for the CU of video data.
 13. A device for coding video data, the device comprising: a memory configured to store video data; and one or more processors implemented in circuitry and configured to: code, via a coded video bitstream, a first syntax element specifying whether a coding unit (CU) of video data is split using intra sub-partition (ISP); code, via the coded video bitstream and where the first syntax element specifies that the CU is split using ISP, a second syntax element specifying a split type for the CU; and responsive to the split type being vertical split or horizontal split: perform splitting of the CU to form a set of prediction blocks, the prediction blocks including at least a first prediction block and a second prediction block; group a plurality of prediction blocks from the set of prediction blocks into a first prediction block group (PBG); and reconstruct samples of prediction blocks included in the first PBG independently of samples of other prediction blocks included in the first PBG.
 14. The device of claim 13, wherein to group the plurality of the set of prediction blocks into the first PBG, the one or more processors are configured to group a first plurality of the set of prediction blocks into the first PBG, and wherein the one or more processors are further configured to: group a second plurality of the set of prediction blocks into a second PBG, the second plurality of the set of prediction blocks and the first plurality of the set of prediction blocks including a non-overlapping set of prediction blocks; and reconstruct samples of prediction blocks included in the second PBG independently of samples of other prediction blocks included in the second PBG.
 15. The device of claim 14, wherein to reconstruct samples of prediction blocks included in the second PBG, the one or more processors are configured to reconstruct samples of prediction blocks included in the second PBG based on samples of the prediction blocks included in the first PBG.
 16. The device of claim 13, wherein the first PBG comprises a sample size width of at least four.
 17. The device of claim 13, wherein a first transform block associated with the first PBG has a size equal to the size of the first prediction block.
 18. The device of claim 13, wherein to reconstruct the samples of prediction blocks, the one or more processors are configured to: generate a prediction sample array having a width equal to an aggregated width of at least the first prediction block and the second prediction block.
 19. The device of claim 13, wherein the first prediction block comprises a sample width of less than or equal to two.
 20. The device of claim 19, wherein the first prediction block and the second prediction block are of a same width.
 21. The device of claim 13, wherein the one or more processors are further configured to: determine a quantity of prediction block groups based on a width of the coding unit and a width of the first prediction block; and code the CU in accordance with the quantity of prediction block groups such that the CU comprises the quantity of prediction block groups.
 22. The device of claim 13, wherein the one or more processors are included in a video encoder.
 23. The device of claim 13, wherein the one or more processors are included in a video decoder.
 24. A non-transitory computer-readable storage medium having stored thereon instructions that, when executed, cause one or more processors to: code, via a coded video bitstream, a first syntax element specifying whether a coding unit (CU) of video data is split using intra sub-partition (ISP); code, via the coded video bitstream and where the first syntax element specifies that the CU is split using ISP, a second syntax element specifying a split type for the CU; and responsive to the split type being vertical split or horizontal split: perform splitting of the CU to form a set of prediction blocks, the prediction blocks including at least a first prediction block and a second prediction block; group a plurality of prediction blocks from the set of prediction blocks into a first prediction block group (PBG); and reconstruct samples of prediction blocks included in the first PBG independently of samples of other prediction blocks included in the first PBG.
 25. An apparatus for coding video data, the apparatus including: means for coding, via a coded video bitstream, a first syntax element specifying whether a coding unit (CU) of video data is split using intra sub-partition (ISP); means for coding, via the coded video bitstream and where the first syntax element specifies that the CU is split using ISP, a second syntax element specifying a split type for the CU; means for performing, responsive to the split type being vertical split or horizontal split, splitting of the CU to form a set of prediction blocks, the prediction blocks including at least a first prediction block and a second prediction block; means for grouping, responsive to the split type being vertical split or horizontal split, a plurality of prediction blocks from the set of prediction blocks into a first prediction block group (PBG); and means for reconstructing, responsive to the split type being vertical split or horizontal split, samples of prediction blocks included in the first PBG independently of samples of other prediction blocks included in the first PBG.
 26. The method of claim 1, wherein the split type is one of: no split, vertical split, and horizontal split.
 27. The method of claim 1, wherein the first syntax element comprises an intra_subpartitions_mode_flag syntax element, and wherein the second syntax element comprises an intra_subpartitions_split_flag syntax element.
 28. The method of claim 1, wherein splitting the CU comprises: splitting a transform block of the CU into the set of prediction blocks.
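
By way of illustration only, and not as a limitation of any claim, the derivation of a split type from the two syntax elements and the determination of a quantity of prediction block groups might be sketched as follows. The names SplitType, derive_split_type, and pbg_count are hypothetical. The mapping of an intra_subpartitions_split_flag value of 1 to a vertical split, and the rule that a PBG is at least four samples wide, are assumptions made for this sketch.

```python
from enum import Enum


class SplitType(Enum):
    """Split types for a coding unit (hypothetical enumeration)."""
    NO_SPLIT = 0
    HORIZONTAL = 1
    VERTICAL = 2


def derive_split_type(mode_flag: int, split_flag: int = 0) -> SplitType:
    """Map the two flags to a split type.

    mode_flag stands in for intra_subpartitions_mode_flag and split_flag
    for intra_subpartitions_split_flag. Assumption: split_flag == 1 means
    vertical and split_flag == 0 means horizontal; split_flag is ignored
    when mode_flag is 0, since it would not be coded in that case.
    """
    if mode_flag == 0:
        return SplitType.NO_SPLIT
    return SplitType.VERTICAL if split_flag == 1 else SplitType.HORIZONTAL


def pbg_count(cu_width: int, pb_width: int, min_pbg_width: int = 4) -> int:
    """Quantity of PBGs from the CU width and the prediction block width.

    Assumption: each PBG is as wide as the larger of pb_width and
    min_pbg_width, and a CU always contains at least one PBG.
    """
    pbg_width = max(pb_width, min_pbg_width)
    return max(1, cu_width // pbg_width)


if __name__ == "__main__":
    print(derive_split_type(mode_flag=1, split_flag=1))
    print(pbg_count(cu_width=8, pb_width=2))
```

Under these assumptions, a CU that is eight samples wide and is split vertically into prediction blocks two samples wide would comprise two PBGs.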